Welcome to the fourth project of the Machine Learning Engineer Nanodegree! In this notebook, template code has already been provided for you to aid in your analysis of the Smartcab and your implemented learning algorithm. You will not need to modify the included code beyond what is requested. There will be questions that you must answer which relate to the project and the visualizations provided in the notebook. Each section where you will answer a question is preceded by a 'Question X' header. Carefully read each question and provide thorough answers in the following text boxes that begin with 'Answer:'. Your project submission will be evaluated based on your answers to each of the questions and the implementation you provide in agent.py.
Note: Code and Markdown cells can be executed using the Shift + Enter keyboard shortcut. In addition, Markdown cells can be edited by typically double-clicking the cell to enter edit mode.
In this project, you will work towards constructing an optimized Q-Learning driving agent that will navigate a Smartcab through its environment towards a goal. Since the Smartcab is expected to drive passengers from one location to another, the driving agent will be evaluated on two very important metrics: Safety and Reliability. A driving agent that gets the Smartcab to its destination while running red lights or narrowly avoiding accidents would be considered unsafe. Similarly, a driving agent that frequently fails to reach the destination in time would be considered unreliable. Maximizing the driving agent's safety and reliability would ensure that Smartcabs have a permanent place in the transportation industry.
Safety and Reliability are measured using a letter-grade system as follows:
| Grade | Safety | Reliability |
|---|---|---|
| A+ | Agent commits no traffic violations, and always chooses the correct action. |
Agent reaches the destination in time for 100% of trips. |
| A | Agent commits few minor traffic violations, such as failing to move on a green light. |
Agent reaches the destination on time for at least 90% of trips. |
| B | Agent commits frequent minor traffic violations, such as failing to move on a green light. |
Agent reaches the destination on time for at least 80% of trips. |
| C | Agent commits at least one major traffic violation, such as driving through a red light. |
Agent reaches the destination on time for at least 70% of trips. |
| D | Agent causes at least one minor accident, such as turning left on green with oncoming traffic. |
Agent reaches the destination on time for at least 60% of trips. |
| F | Agent causes at least one major accident, such as driving through a red light with cross-traffic. |
Agent fails to reach the destination on time for at least 60% of trips. |
To assist evaluating these important metrics, you will need to load visualization code that will be used later on in the project. Run the code cell below to import this code which is required for your analysis.
# Import the visualization code This is agent.py
import random
import math
from smartcab.environment import Agent, Environment
from smartcab.planner import RoutePlanner
from smartcab.simulator import Simulator
class LearningAgent(Agent):
""" An agent that learns to drive in the Smartcab world.
This is the object you will be modifying. """
def __init__(self, env, learning=False, epsilon=1.0, alpha=0.5):
super(LearningAgent, self).__init__(env) # Set the agent in the evironment
self.planner = RoutePlanner(self.env, self) # Create a route planner
self.valid_actions = self.env.valid_actions # The set of valid actions
# Set parameters of the learning agent
self.learning = learning # Whether the agent is expected to learn
self.Q = dict() # Create a Q-table which will be a dictionary of tuples
self.epsilon = epsilon # Random exploration factor
self.alpha = alpha # Learning factor
###########
## TO DO ##
###########
# Set any additional class parameters as needed
self.trial_num = 0
self.prev_state = None
self.counter = 1
def reset(self, destination=None, testing=False):
""" The reset function is called at the beginning of each trial.
'testing' is set to True if testing trials are being used
once training trials have completed. """
# Select the destination as the new location to route to
self.planner.route_to(destination)
###########
## TO DO ##
###########
# Update epsilon using a decay function of your choice
# Update additional class parameters as needed
# If 'testing' is True, set epsilon and alpha to 0
if testing:
self.epsilon = 0
self.alpha = 0
else:
self.epsilon = math.exp(-.01*self.trial_num)
self.trial_num = self.trial_num +1
if self.alpha > .2:
self.alpha = self.alpha - 0.02
return None
def build_state(self):
""" The build_state function is called when the agent requests data from the
environment. The next waypoint, the intersection inputs, and the deadline
are all features available to the agent. """
# Collect data about the environment
waypoint = self.planner.next_waypoint() # The next waypoint
inputs = self.env.sense(self) # Visual input - intersection light and traffic
deadline = self.env.get_deadline(self) # Remaining deadline
###########
## TO DO ##
###########
# NOTE : you are not allowed to engineer eatures outside of the inputs available.
# Because the aim of this project is to teach Reinforcement Learning, we have placed
# constraints in order for you to learn how to adjust epsilon and alpha, and thus learn about the balance between exploration and exploitation.
# With the hand-engineered features, this learning process gets entirely negated.
# Set 'state' as a tuple of relevant data for the agent
#state = None
state = (waypoint, inputs['light'], inputs['oncoming'], inputs['left'])
self.prev_state = state
return state
def get_maxQ(self, state):
""" The get_max_Q function is called when the agent is asked to find the
maximum Q-value of all actions based on the 'state' the smartcab is in. """
###########
## TO DO ##
###########
# Calculate the maximum Q-value of all actions for a given state
maxQ = None
for key in self.Q[state]:
if (maxQ == None) or (maxQ < self.Q[state][key]):
maxQ = self.Q[state][key]
return maxQ
def createQ(self, state):
""" The createQ function is called when a state is generated by the agent. """
###########
## TO DO ##
###########
# When learning, check if the 'state' is not in the Q-table
# If it is not, create a new dictionary for that state
# Then, for each action available, set the initial Q-value to 0.0
if self.learning:
if not (state in self.Q.keys()):
self.Q[state] = {None: 0.0, 'forward': 0.0, 'left': 0.0, 'right': 0.0}
return
def choose_action(self, state):
""" The choose_action function is called when the agent is asked to choose
which action to take, based on the 'state' the smartcab is in. """
# Set the agent state and default action
self.state = state
self.next_waypoint = self.planner.next_waypoint()
action = None
###########
## TO DO ##
###########
# When not learning, choose a random action
# When learning, choose a random action with 'epsilon' probability
# Otherwise, choose an action with the highest Q-value for the current state
# Be sure that when choosing an action with highest Q-value that you randomly select between actions that "tie".
if self.learning:
if random.random() < self.epsilon:
action = random.choice(self.valid_actions)
else:
maxQ = self.get_maxQ(state)
argmax_actions = [k for k,v in self.Q[state].items() if v == maxQ]
action = random.choice(argmax_actions)
else:
action = random.choice(self.valid_actions)
return action
def learn(self, state, action, reward):
""" The learn function is called after the agent completes an action and
receives a reward. This function does not consider future rewards
when conducting learning. """
###########
## TO DO ##
###########
# When learning, implement the value iteration update rule
# Use only the learning rate 'alpha' (do not use the discount factor 'gamma')
if self.learning:
self.Q[self.state][action] = (1.0 - self.alpha)*self.Q[self.state][action] + (self.alpha)*(reward)
self.counter += 1
return
def update(self):
""" The update function is called when a time step is completed in the
environment for a given trial. This function will build the agent
state, choose an action, receive a reward, and learn if enabled. """
state = self.build_state() # Get current state
self.createQ(state) # Create 'state' in Q-table
action = self.choose_action(state) # Choose an action
reward = self.env.act(self, action) # Receive a reward
self.learn(state, action, reward) # Q-learn
return
def run():
""" Driving function for running the simulation.
Press ESC to close the simulation, or [SPACE] to pause the simulation. """
##############
# Create the environment
# Flags:
# verbose - set to True to display additional output from the simulation
# num_dummies - discrete number of dummy agents in the environment, default is 100
# grid_size - discrete number of intersections (columns, rows), default is (8, 6)
env = Environment(verbose = True)
##############
# Create the driving agent
# Flags:
# learning - set to True to force the driving agent to use Q-learning
# * epsilon - continuous value for the exploration factor, default is 1
# * alpha - continuous value for the learning rate, default is 0.5
agent = env.create_agent(LearningAgent, learning = True, epsilon = .95, alpha = .5)
##############
# Follow the driving agent
# Flags:
# enforce_deadline - set to True to enforce a deadline metric
env.set_primary_agent(agent, enforce_deadline = True)
##############
# Create the simulation
# Flags:
# update_delay - continuous time (in seconds) between actions, default is 2.0 seconds
# display - set to False to disable the GUI if PyGame is enabled
# log_metrics - set to True to log trial and simulation results to /logs
# optimized - set to True to change the default log file name
sim = Simulator(env, update_delay = 0.01, log_metrics =True, optimized = True)
##############
# Run the simulator
# Flags:
# tolerance - epsilon tolerance before beginning testing, default is 0.05
# n_test - discrete number of testing trials to perform, default is 0
sim.run(n_test = 10, tolerance = 0.015)
if __name__ == '__main__':
run()
/-------------------------
| Training trial 1
\-------------------------
Environment.reset(): Trial set up with start = (7, 4), destination = (3, 2), deadline = 30
Simulating trial. . .
epsilon = 1.0000; alpha = 0.4800
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (7, 3), heading: (0, -1), action: right, reward: 2.42214899313
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 30, 't': 0, 'action': 'right', 'reward': 2.4221489931304285, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.42)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: right, reward: 1.17233567413
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 29, 't': 1, 'action': 'right', 'reward': 1.17233567412838, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.17)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: forward, reward: 1.66097586225
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'left'), 'deadline': 28, 't': 2, 'action': 'forward', 'reward': 1.6609758622540458, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'left')
Agent followed the waypoint forward. (rewarded 1.66)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (1, 4), heading: (0, 1), action: right, reward: 0.364540635097
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 27, 't': 3, 'action': 'right', 'reward': 0.36454063509670076, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent drove right instead of forward. (rewarded 0.36)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 4), heading: (0, 1), action: forward, reward: -10.1215622119
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 26, 't': 4, 'action': 'forward', 'reward': -10.121562211913544, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent attempted driving forward through a red light. (rewarded -10.12)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 5), heading: (0, 1), action: forward, reward: 1.93508081814
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 25, 't': 5, 'action': 'forward', 'reward': 1.9350808181363846, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove forward instead of left. (rewarded 1.94)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 5), heading: (0, 1), action: None, reward: -5.46124783669
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 24, 't': 6, 'action': None, 'reward': -5.461247836691815, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.46)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 5), heading: (0, 1), action: None, reward: 1.94621579674
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 23, 't': 7, 'action': None, 'reward': 1.9462157967433247, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.95)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (1, 5), heading: (0, 1), action: left, reward: -9.29210416335
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 22, 't': 8, 'action': 'left', 'reward': -9.292104163346231, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -9.29)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: right, reward: 0.996715214397
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 21, 't': 9, 'action': 'right', 'reward': 0.9967152143970635, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent drove right instead of left. (rewarded 1.00)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: left, reward: -10.8088648277
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 20, 't': 10, 'action': 'left', 'reward': -10.808864827676574, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -10.81)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: left, reward: -40.5671801887
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', 'right', 'forward'), 'deadline': 19, 't': 11, 'action': 'left', 'reward': -40.56718018873804, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', 'forward')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -40.57)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: forward, reward: -9.36096177025
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 18, 't': 12, 'action': 'forward', 'reward': -9.3609617702549, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -9.36)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (7, 5), heading: (-1, 0), action: forward, reward: 1.60087155221
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 17, 't': 13, 'action': 'forward', 'reward': 1.6008715522146433, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove forward instead of left. (rewarded 1.60)
53% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (7, 5), heading: (-1, 0), action: None, reward: 1.46307921482
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 16, 't': 14, 'action': None, 'reward': 1.463079214822594, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.46)
50% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (7, 5), heading: (-1, 0), action: left, reward: -10.523654608
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 15, 't': 15, 'action': 'left', 'reward': -10.523654607991682, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent attempted driving left through a red light. (rewarded -10.52)
47% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (7, 5), heading: (-1, 0), action: None, reward: -4.01438313421
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', 'left', 'forward'), 'deadline': 14, 't': 16, 'action': None, 'reward': -4.014383134211511, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -4.01)
43% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (7, 6), heading: (0, 1), action: left, reward: 2.28896124112
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 13, 't': 17, 'action': 'left', 'reward': 2.2889612411200893, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 2.29)
40% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (7, 6), heading: (0, 1), action: None, reward: 1.64892587356
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 12, 't': 18, 'action': None, 'reward': 1.6489258735621375, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.65)
37% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (7, 6), heading: (0, 1), action: left, reward: -9.40056890826
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 11, 't': 19, 'action': 'left', 'reward': -9.400568908259519, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent attempted driving left through a red light. (rewarded -9.40)
33% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (6, 6), heading: (-1, 0), action: right, reward: 0.411156178047
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 10, 't': 20, 'action': 'right', 'reward': 0.4111561780470496, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent drove right instead of left. (rewarded 0.41)
30% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (6, 5), heading: (0, -1), action: right, reward: 0.912566904929
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 9, 't': 21, 'action': 'right', 'reward': 0.9125669049288968, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent drove right instead of forward. (rewarded 0.91)
27% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (6, 5), heading: (0, -1), action: forward, reward: -40.2420278736
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'forward', 'left': None}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 8, 't': 22, 'action': 'forward', 'reward': -40.24202787360673, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -40.24)
23% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (7, 5), heading: (1, 0), action: right, reward: -0.416253454792
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', None), 'deadline': 7, 't': 23, 'action': 'right', 'reward': -0.4162534547917488, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', None)
Agent drove right instead of left. (rewarded -0.42)
20% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (7, 5), heading: (1, 0), action: None, reward: 0.901773695825
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 6, 't': 24, 'action': None, 'reward': 0.9017736958254265, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.90)
17% of time remaining to reach destination.
/-------------------
| Step 25 Results
\-------------------
Environment.step(): t = 25
Environment.act() [POST]: location: (7, 5), heading: (1, 0), action: left, reward: -40.0738446711
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 5, 't': 25, 'action': 'left', 'reward': -40.073844671131816, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -40.07)
13% of time remaining to reach destination.
/-------------------
| Step 26 Results
\-------------------
Environment.step(): t = 26
Environment.act() [POST]: location: (7, 6), heading: (0, 1), action: right, reward: 0.529810904079
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 4, 't': 26, 'action': 'right', 'reward': 0.5298109040792487, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove right instead of forward. (rewarded 0.53)
10% of time remaining to reach destination.
/-------------------
| Step 27 Results
\-------------------
Environment.step(): t = 27
Environment.act() [POST]: location: (7, 6), heading: (0, 1), action: left, reward: -40.7054527193
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 3, 't': 27, 'action': 'left', 'reward': -40.70545271934346, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -40.71)
7% of time remaining to reach destination.
/-------------------
| Step 28 Results
\-------------------
Environment.step(): t = 28
Environment.act() [POST]: location: (7, 6), heading: (0, 1), action: forward, reward: -9.29590516058
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 2, 't': 28, 'action': 'forward', 'reward': -9.295905160576794, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -9.30)
3% of time remaining to reach destination.
/-------------------
| Step 29 Results
\-------------------
Environment.step(): t = 29
Environment.act() [POST]: location: (7, 6), heading: (0, 1), action: left, reward: -10.2907286725
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 1, 't': 29, 'action': 'left', 'reward': -10.29072867248367, 'waypoint': 'left'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('left', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -10.29)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 2
\-------------------------
Environment.reset(): Trial set up with start = (2, 3), destination = (5, 7), deadline = 25
Simulating trial. . .
epsilon = 0.9900; alpha = 0.4600
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 3), heading: (0, 1), action: right, reward: -20.4295393915
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 3, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 25, 't': 0, 'action': 'right', 'reward': -20.429539391455837, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent attempted driving right through traffic and cause a minor accident. (rewarded -20.43)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 3), heading: (-1, 0), action: right, reward: 0.595214937382
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 24, 't': 1, 'action': 'right', 'reward': 0.5952149373815339, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent drove right instead of left. (rewarded 0.60)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 2), heading: (0, -1), action: right, reward: 0.462976566057
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 23, 't': 2, 'action': 'right', 'reward': 0.4629765660574032, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent drove right instead of forward. (rewarded 0.46)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (1, 2), heading: (0, -1), action: left, reward: -10.5911728626
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 22, 't': 3, 'action': 'left', 'reward': -10.591172862628078, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent attempted driving left through a red light. (rewarded -10.59)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 2), heading: (0, -1), action: left, reward: -40.5243955606
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': None}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 21, 't': 4, 'action': 'left', 'reward': -40.52439556059413, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -40.52)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 7), heading: (0, -1), action: forward, reward: 1.04774311513
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 20, 't': 5, 'action': 'forward', 'reward': 1.0477431151322847, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove forward instead of left. (rewarded 1.05)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 7), heading: (0, -1), action: forward, reward: -10.9811531498
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 19, 't': 6, 'action': 'forward', 'reward': -10.981153149813876, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -10.98)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 7), heading: (1, 0), action: right, reward: 0.701461654855
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 18, 't': 7, 'action': 'right', 'reward': 0.7014616548549955, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove right instead of left. (rewarded 0.70)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 2), heading: (0, 1), action: right, reward: 1.856327366
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', 'left'), 'deadline': 17, 't': 8, 'action': 'right', 'reward': 1.8563273660015676, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', 'left')
Agent drove right instead of forward. (rewarded 1.86)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 3), heading: (0, 1), action: forward, reward: 0.918219000149
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 16, 't': 9, 'action': 'forward', 'reward': 0.9182190001486776, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove forward instead of left. (rewarded 0.92)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (2, 4), heading: (0, 1), action: forward, reward: 1.03753440898
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', 'forward'), 'deadline': 15, 't': 10, 'action': 'forward', 'reward': 1.0375344089839853, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', 'forward')
Agent drove forward instead of left. (rewarded 1.04)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (2, 5), heading: (0, 1), action: forward, reward: -0.0870690609906
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', 'right'), 'deadline': 14, 't': 11, 'action': 'forward', 'reward': -0.08706906099057521, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', 'right')
Agent drove forward instead of left. (rewarded -0.09)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: left, reward: 1.39473081771
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'forward'), 'deadline': 13, 't': 12, 'action': 'left', 'reward': 1.3947308177074065, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'forward')
Agent followed the waypoint left. (rewarded 1.39)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: None, reward: 0.916536616575
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 12, 't': 13, 'action': None, 'reward': 0.9165366165747242, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 0.92)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: left, reward: -9.31556408406
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 11, 't': 14, 'action': 'left', 'reward': -9.315564084055007, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent attempted driving left through a red light. (rewarded -9.32)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: None, reward: 2.53658355838
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 10, 't': 15, 'action': None, 'reward': 2.536583558382133, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.54)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (3, 6), heading: (0, 1), action: right, reward: 0.457272113867
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 9, 't': 16, 'action': 'right', 'reward': 0.4572721138670933, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent drove right instead of forward. (rewarded 0.46)
32% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (3, 6), heading: (0, 1), action: right, reward: -19.2552907954
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 3, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 8, 't': 17, 'action': 'right', 'reward': -19.255290795403276, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent attempted driving right through traffic and cause a minor accident. (rewarded -19.26)
28% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (3, 6), heading: (0, 1), action: left, reward: -9.79722129676
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 7, 't': 18, 'action': 'left', 'reward': -9.79722129676497, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -9.80)
24% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (3, 7), heading: (0, 1), action: forward, reward: 1.29754012828
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 6, 't': 19, 'action': 'forward', 'reward': 1.2975401282807808, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove forward instead of left. (rewarded 1.30)
20% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (3, 7), heading: (0, 1), action: None, reward: -4.41111415969
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 5, 't': 20, 'action': None, 'reward': -4.411114159688502, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.41)
16% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (2, 7), heading: (-1, 0), action: right, reward: 1.19545370004
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 4, 't': 21, 'action': 'right', 'reward': 1.1954537000391192, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent drove right instead of left. (rewarded 1.20)
12% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (2, 7), heading: (-1, 0), action: None, reward: 0.457048071428
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 3, 't': 22, 'action': None, 'reward': 0.45704807142790926, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 0.46)
8% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (2, 7), heading: (-1, 0), action: forward, reward: -10.6805050515
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 2, 't': 23, 'action': 'forward', 'reward': -10.68050505151798, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -10.68)
4% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (2, 7), heading: (-1, 0), action: None, reward: -0.703637504781
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 1, 't': 24, 'action': None, 'reward': -0.7036375047808776, 'waypoint': 'right'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('right', 'red', None, None)
Agent properly idled at a red light. (rewarded -0.70)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 3
\-------------------------
Environment.reset(): Trial set up with start = (4, 5), destination = (1, 4), deadline = 20
Simulating trial. . .
epsilon = 0.9802; alpha = 0.4400
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (4, 5), heading: (0, 1), action: None, reward: 0.0986521538402
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 20, 't': 0, 'action': None, 'reward': 0.09865215384017101, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 0.10)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (4, 5), heading: (0, 1), action: left, reward: -10.936121194
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 19, 't': 1, 'action': 'left', 'reward': -10.936121194025215, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent attempted driving left through a red light. (rewarded -10.94)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 5), heading: (-1, 0), action: right, reward: 2.17990737152
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': 'right', 'reward': 2.179907371520785, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 2.18)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 5), heading: (-1, 0), action: None, reward: 1.316950792
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 17, 't': 3, 'action': None, 'reward': 1.316950792004547, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.32)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: forward, reward: 2.07295902293
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 2.07295902292606, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'right')
Agent followed the waypoint forward. (rewarded 2.07)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: left, reward: -10.2305809932
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 15, 't': 5, 'action': 'left', 'reward': -10.230580993187779, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -10.23)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: None, reward: 1.20285140778
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'right'), 'deadline': 14, 't': 6, 'action': None, 'reward': 1.2028514077765724, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'right')
Agent properly idled at a red light. (rewarded 1.20)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: forward, reward: 2.07666842275
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'left'), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 2.0766684227548553, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'left')
Agent followed the waypoint forward. (rewarded 2.08)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (1, 6), heading: (0, 1), action: left, reward: 1.68427868298
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 12, 't': 8, 'action': 'left', 'reward': 1.6842786829844894, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent drove left instead of right. (rewarded 1.68)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (1, 6), heading: (0, 1), action: left, reward: -9.06764039649
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 11, 't': 9, 'action': 'left', 'reward': -9.067640396488066, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -9.07)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (1, 6), heading: (0, 1), action: None, reward: -4.90239754765
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 10, 't': 10, 'action': None, 'reward': -4.902397547648182, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -4.90)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (1, 6), heading: (0, 1), action: None, reward: -4.92808921604
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 9, 't': 11, 'action': None, 'reward': -4.928089216036499, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -4.93)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: forward, reward: 0.89600928323
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 8, 't': 12, 'action': 'forward', 'reward': 0.8960092832302384, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent drove forward instead of right. (rewarded 0.90)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: right, reward: -20.2586104822
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 3, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 7, 't': 13, 'action': 'right', 'reward': -20.25861048216734, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent attempted driving right through traffic and cause a minor accident. (rewarded -20.26)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: right, reward: -19.9420528398
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 3, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 6, 't': 14, 'action': 'right', 'reward': -19.94205283975433, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent attempted driving right through traffic and cause a minor accident. (rewarded -19.94)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (1, 2), heading: (0, 1), action: forward, reward: 2.05163193934
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 5, 't': 15, 'action': 'forward', 'reward': 2.0516319393357456, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.05)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: right, reward: -0.047711019135
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 4, 't': 16, 'action': 'right', 'reward': -0.04771101913501141, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent drove right instead of forward. (rewarded -0.05)
15% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (8, 7), heading: (0, -1), action: right, reward: 0.958959399235
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 3, 't': 17, 'action': 'right', 'reward': 0.9589593992345368, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent drove right instead of left. (rewarded 0.96)
10% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: right, reward: 1.90959876575
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 2, 't': 18, 'action': 'right', 'reward': 1.9095987657489746, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 1.91)
5% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (1, 2), heading: (0, 1), action: right, reward: 0.628882614597
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 1, 't': 19, 'action': 'right', 'reward': 0.6288826145966366, 'waypoint': 'right'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 0.63)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 4
\-------------------------
Environment.reset(): Trial set up with start = (2, 4), destination = (5, 6), deadline = 25
Simulating trial. . .
epsilon = 0.9704; alpha = 0.4200
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: forward, reward: 1.41593315175
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 25, 't': 0, 'action': 'forward', 'reward': 1.415933151752064, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.42)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: None, reward: -5.99473693992
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 24, 't': 1, 'action': None, 'reward': -5.994736939924223, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -5.99)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 5), heading: (0, 1), action: right, reward: 1.56995694279
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 23, 't': 2, 'action': 'right', 'reward': 1.5699569427853615, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent drove right instead of forward. (rewarded 1.57)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: right, reward: 0.759376349351
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 22, 't': 3, 'action': 'right', 'reward': 0.7593763493513428, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove right instead of left. (rewarded 0.76)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: None, reward: 2.50623224811
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', 'forward'), 'deadline': 21, 't': 4, 'action': None, 'reward': 2.5062322481051877, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 2.51)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: forward, reward: -40.1638061718
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 20, 't': 5, 'action': 'forward', 'reward': -40.16380617182168, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -40.16)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 4), heading: (0, -1), action: right, reward: 0.791630096293
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 19, 't': 6, 'action': 'right', 'reward': 0.791630096292865, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent drove right instead of left. (rewarded 0.79)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 4), heading: (0, -1), action: None, reward: 0.846676829069
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 18, 't': 7, 'action': None, 'reward': 0.846676829068574, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 0.85)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 4), heading: (0, -1), action: None, reward: 0.323777276264
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 17, 't': 8, 'action': None, 'reward': 0.32377727626437125, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 0.32)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: right, reward: 2.66692478714
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 16, 't': 9, 'action': 'right', 'reward': 2.6669247871394663, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 2.67)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (4, 4), heading: (1, 0), action: forward, reward: 1.79894649122
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 15, 't': 10, 'action': 'forward', 'reward': 1.7989464912242703, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.80)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (5, 4), heading: (1, 0), action: forward, reward: 0.88665014968
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 14, 't': 11, 'action': 'forward', 'reward': 0.8866501496795849, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent followed the waypoint forward. (rewarded 0.89)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (5, 5), heading: (0, 1), action: right, reward: 0.931754854098
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'right', 'left'), 'deadline': 13, 't': 12, 'action': 'right', 'reward': 0.9317548540983152, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'right', 'left')
Agent followed the waypoint right. (rewarded 0.93)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (5, 5), heading: (0, 1), action: forward, reward: -9.86503923938
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 12, 't': 13, 'action': 'forward', 'reward': -9.86503923937533, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -9.87)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (5, 5), heading: (0, 1), action: None, reward: 1.99287744175
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'right'), 'deadline': 11, 't': 14, 'action': None, 'reward': 1.9928774417502357, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 1.99)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (6, 5), heading: (1, 0), action: left, reward: 1.31808382527
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 10, 't': 15, 'action': 'left', 'reward': 1.318083825268646, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove left instead of forward. (rewarded 1.32)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (7, 5), heading: (1, 0), action: forward, reward: 1.36532783628
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 9, 't': 16, 'action': 'forward', 'reward': 1.3653278362755241, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent drove forward instead of right. (rewarded 1.37)
32% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (7, 5), heading: (1, 0), action: None, reward: -5.5017677976
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 8, 't': 17, 'action': None, 'reward': -5.501767797595875, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -5.50)
28% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (7, 4), heading: (0, -1), action: left, reward: 0.465343626355
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', 'forward'), 'deadline': 7, 't': 18, 'action': 'left', 'reward': 0.46534362635473614, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', 'forward')
Agent drove left instead of right. (rewarded 0.47)
24% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (7, 3), heading: (0, -1), action: forward, reward: -0.421492819282
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 6, 't': 19, 'action': 'forward', 'reward': -0.42149281928155535, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove forward instead of left. (rewarded -0.42)
20% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (6, 3), heading: (-1, 0), action: left, reward: 1.28211356183
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 5, 't': 20, 'action': 'left', 'reward': 1.2821135618333954, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 1.28)
16% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (6, 3), heading: (-1, 0), action: forward, reward: -9.10734200056
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 4, 't': 21, 'action': 'forward', 'reward': -9.107342000559965, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent attempted driving forward through a red light. (rewarded -9.11)
12% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (6, 3), heading: (-1, 0), action: left, reward: -10.5785138739
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 3, 't': 22, 'action': 'left', 'reward': -10.578513873850135, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -10.58)
8% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (6, 3), heading: (-1, 0), action: None, reward: -5.62978857372
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 2, 't': 23, 'action': None, 'reward': -5.62978857371648, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.63)
4% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (6, 2), heading: (0, -1), action: right, reward: -0.842752167802
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 1, 't': 24, 'action': 'right', 'reward': -0.8427521678017247, 'waypoint': 'forward'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('forward', 'green', 'right', None)
Agent drove right instead of forward. (rewarded -0.84)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 5
\-------------------------
Environment.reset(): Trial set up with start = (6, 3), destination = (3, 2), deadline = 20
Simulating trial. . .
epsilon = 0.9608; alpha = 0.4000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 3), heading: (0, 1), action: None, reward: -4.03486548534
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 20, 't': 0, 'action': None, 'reward': -4.034865485335335, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.03)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 3), heading: (-1, 0), action: right, reward: 2.512383615
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 2.5123836149950276, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.51)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 2), heading: (0, -1), action: right, reward: 0.701275366531
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 18, 't': 2, 'action': 'right', 'reward': 0.701275366531417, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent drove right instead of forward. (rewarded 0.70)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 2), heading: (0, -1), action: left, reward: -9.29883941386
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 17, 't': 3, 'action': 'left', 'reward': -9.298839413855813, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent attempted driving left through a red light. (rewarded -9.30)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (4, 2), heading: (-1, 0), action: left, reward: 2.14100753759
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'right'), 'deadline': 16, 't': 4, 'action': 'left', 'reward': 2.1410075375937665, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'right')
Agent followed the waypoint left. (rewarded 2.14)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (4, 2), heading: (-1, 0), action: forward, reward: -39.3110335833
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': -39.31103358331567, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -39.31)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 2), heading: (-1, 0), action: forward, reward: -9.24096847748
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 14, 't': 6, 'action': 'forward', 'reward': -9.240968477484016, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent attempted driving forward through a red light. (rewarded -9.24)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (4, 2), heading: (-1, 0), action: None, reward: -4.68549574483
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 13, 't': 7, 'action': None, 'reward': -4.6854957448286765, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.69)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (4, 3), heading: (0, 1), action: left, reward: 1.82210549151
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'right'), 'deadline': 12, 't': 8, 'action': 'left', 'reward': 1.8221054915069321, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'right')
Agent drove left instead of forward. (rewarded 1.82)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (3, 3), heading: (-1, 0), action: right, reward: 2.08217446305
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'right'), 'deadline': 11, 't': 9, 'action': 'right', 'reward': 2.0821744630487924, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'right')
Agent followed the waypoint right. (rewarded 2.08)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (3, 3), heading: (-1, 0), action: None, reward: 1.05137474029
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'right'), 'deadline': 10, 't': 10, 'action': None, 'reward': 1.0513747402875837, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 1.05)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (2, 3), heading: (-1, 0), action: forward, reward: 0.759270280397
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 9, 't': 11, 'action': 'forward', 'reward': 0.7592702803969866, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent drove forward instead of right. (rewarded 0.76)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (2, 3), heading: (-1, 0), action: forward, reward: -10.8795778394
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', 'right', None), 'deadline': 8, 't': 12, 'action': 'forward', 'reward': -10.879577839397152, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', None)
Agent attempted driving forward through a red light. (rewarded -10.88)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (2, 2), heading: (0, -1), action: right, reward: 1.16417887597
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 7, 't': 13, 'action': 'right', 'reward': 1.1641788759657035, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.16)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 2), heading: (1, 0), action: right, reward: 0.903728031154
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 6, 't': 14, 'action': 'right', 'reward': 0.9037280311538245, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 0.90)
25% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 6
\-------------------------
Environment.reset(): Trial set up with start = (1, 6), destination = (4, 5), deadline = 20
Simulating trial. . .
epsilon = 0.9512; alpha = 0.3800
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 5), heading: (0, -1), action: left, reward: 0.980150344476
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 20, 't': 0, 'action': 'left', 'reward': 0.9801503444759658, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent drove left instead of forward. (rewarded 0.98)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 5), heading: (0, -1), action: left, reward: -19.3091623097
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 3, 'light': 'green', 'state': ('right', 'green', 'forward', 'forward'), 'deadline': 19, 't': 1, 'action': 'left', 'reward': -19.3091623096845, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', 'forward')
Agent attempted driving left through traffic and cause a minor accident. (rewarded -19.31)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: right, reward: 1.69225877077
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 18, 't': 2, 'action': 'right', 'reward': 1.6922587707735974, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 1.69)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: right, reward: 0.31428032469
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 17, 't': 3, 'action': 'right', 'reward': 0.31428032468993405, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove right instead of forward. (rewarded 0.31)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 7), heading: (0, 1), action: forward, reward: 1.48480402479
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 1.4848040247865382, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent drove forward instead of left. (rewarded 1.48)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 7), heading: (0, 1), action: None, reward: -5.99895194492
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'right'}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, 'right'), 'deadline': 15, 't': 5, 'action': None, 'reward': -5.998951944922902, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'right')
Agent idled at a green light with no oncoming traffic. (rewarded -6.00)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 7), heading: (-1, 0), action: right, reward: 0.422256835437
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 14, 't': 6, 'action': 'right', 'reward': 0.422256835436577, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 0.42)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 7), heading: (-1, 0), action: left, reward: -10.7643810654
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 13, 't': 7, 'action': 'left', 'reward': -10.764381065382683, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -10.76)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (1, 7), heading: (-1, 0), action: None, reward: 0.934987436429
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 12, 't': 8, 'action': None, 'reward': 0.9349874364292586, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.93)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (1, 7), heading: (-1, 0), action: None, reward: -4.75013636992
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', 'left', 'right'), 'deadline': 11, 't': 9, 'action': None, 'reward': -4.75013636992198, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', 'right')
Agent idled at a green light with no oncoming traffic. (rewarded -4.75)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (1, 6), heading: (0, -1), action: right, reward: 1.72543504031
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 10, 't': 10, 'action': 'right', 'reward': 1.7254350403064367, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.73)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: right, reward: 2.60513627079
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 9, 't': 11, 'action': 'right', 'reward': 2.60513627078762, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.61)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: None, reward: 1.49090506217
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 8, 't': 12, 'action': None, 'reward': 1.4909050621666742, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.49)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: left, reward: -20.162255432
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 3, 'light': 'green', 'state': ('forward', 'green', 'forward', 'forward'), 'deadline': 7, 't': 13, 'action': 'left', 'reward': -20.162255431982043, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'forward')
Agent attempted driving left through traffic and cause a minor accident. (rewarded -20.16)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: None, reward: -5.35004263771
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'right', 'forward'), 'deadline': 6, 't': 14, 'action': None, 'reward': -5.350042637714778, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -5.35)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: forward, reward: 2.24069349839
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 5, 't': 15, 'action': 'forward', 'reward': 2.240693498392033, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.24)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: forward, reward: -10.3290096864
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 4, 't': 16, 'action': 'forward', 'reward': -10.329009686420568, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent attempted driving forward through a red light. (rewarded -10.33)
15% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (4, 6), heading: (1, 0), action: forward, reward: 1.53614702288
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 3, 't': 17, 'action': 'forward', 'reward': 1.5361470228823695, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.54)
10% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (4, 7), heading: (0, 1), action: right, reward: -0.527745218614
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 2, 't': 18, 'action': 'right', 'reward': -0.527745218613765, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent drove right instead of left. (rewarded -0.53)
5% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (4, 2), heading: (0, 1), action: forward, reward: -0.2369987906
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 1, 't': 19, 'action': 'forward', 'reward': -0.2369987905996368, 'waypoint': 'right'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('right', 'green', None, None)
Agent drove forward instead of right. (rewarded -0.24)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 7
\-------------------------
Environment.reset(): Trial set up with start = (2, 6), destination = (6, 2), deadline = 30
Simulating trial. . .
epsilon = 0.9418; alpha = 0.3600
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: forward, reward: -10.5840870883
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 30, 't': 0, 'action': 'forward', 'reward': -10.584087088345255, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent attempted driving forward through a red light. (rewarded -10.58)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: forward, reward: -9.03609309916
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 29, 't': 1, 'action': 'forward', 'reward': -9.036093099155883, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent attempted driving forward through a red light. (rewarded -9.04)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: None, reward: 2.90542767331
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 28, 't': 2, 'action': None, 'reward': 2.905427673306809, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.91)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: None, reward: 1.40746012556
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 27, 't': 3, 'action': None, 'reward': 1.4074601255569594, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.41)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: left, reward: -9.02357047354
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 26, 't': 4, 'action': 'left', 'reward': -9.023570473544842, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -9.02)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 5), heading: (0, -1), action: right, reward: 0.31812728431
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 25, 't': 5, 'action': 'right', 'reward': 0.3181272843097728, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove right instead of forward. (rewarded 0.32)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: left, reward: 1.02023061478
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 24, 't': 6, 'action': 'left', 'reward': 1.0202306147825557, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 1.02)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: forward, reward: -40.7188064176
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 23, 't': 7, 'action': 'forward', 'reward': -40.71880641761277, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -40.72)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (1, 4), heading: (0, -1), action: right, reward: -0.0342157233756
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 22, 't': 8, 'action': 'right', 'reward': -0.03421572337559031, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent drove right instead of forward. (rewarded -0.03)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (1, 4), heading: (0, -1), action: forward, reward: -9.29060756049
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 21, 't': 9, 'action': 'forward', 'reward': -9.290607560486068, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent attempted driving forward through a red light. (rewarded -9.29)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (1, 4), heading: (0, -1), action: forward, reward: -10.1698006003
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, 'right'), 'deadline': 20, 't': 10, 'action': 'forward', 'reward': -10.16980060029582, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'right')
Agent attempted driving forward through a red light. (rewarded -10.17)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (2, 4), heading: (1, 0), action: right, reward: 1.37596808219
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 19, 't': 11, 'action': 'right', 'reward': 1.3759680821949067, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent drove right instead of left. (rewarded 1.38)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (2, 4), heading: (1, 0), action: None, reward: 1.52710649542
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', 'forward'), 'deadline': 18, 't': 12, 'action': None, 'reward': 1.527106495415467, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 1.53)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (2, 3), heading: (0, -1), action: left, reward: 1.10308185909
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 17, 't': 13, 'action': 'left', 'reward': 1.1030818590902955, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.10)
53% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (2, 3), heading: (0, -1), action: forward, reward: -10.2316528203
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 16, 't': 14, 'action': 'forward', 'reward': -10.231652820256063, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -10.23)
50% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (2, 3), heading: (0, -1), action: None, reward: 2.29400755203
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 15, 't': 15, 'action': None, 'reward': 2.2940075520316916, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.29)
47% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (1, 3), heading: (-1, 0), action: left, reward: 2.21348558429
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'right'), 'deadline': 14, 't': 16, 'action': 'left', 'reward': 2.2134855842929193, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'right')
Agent followed the waypoint left. (rewarded 2.21)
43% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (1, 2), heading: (0, -1), action: right, reward: 1.04170603669
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 13, 't': 17, 'action': 'right', 'reward': 1.0417060366894724, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent drove right instead of forward. (rewarded 1.04)
40% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (2, 2), heading: (1, 0), action: right, reward: 1.55378659193
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 12, 't': 18, 'action': 'right', 'reward': 1.553786591925978, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent drove right instead of left. (rewarded 1.55)
37% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (2, 2), heading: (1, 0), action: None, reward: -0.0349222238145
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 11, 't': 19, 'action': None, 'reward': -0.034922223814536024, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent properly idled at a red light. (rewarded -0.03)
33% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (2, 3), heading: (0, 1), action: right, reward: 0.785639594115
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', 'right'), 'deadline': 10, 't': 20, 'action': 'right', 'reward': 0.785639594115245, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', 'right')
Agent followed the waypoint right. (rewarded 0.79)
30% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (2, 3), heading: (0, 1), action: right, reward: -19.0493832501
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 3, 'light': 'red', 'state': ('right', 'red', 'forward', 'forward'), 'deadline': 9, 't': 21, 'action': 'right', 'reward': -19.049383250054785, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', 'forward')
Agent attempted driving right through traffic and cause a minor accident. (rewarded -19.05)
27% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (2, 3), heading: (0, 1), action: forward, reward: -9.87771442043
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 8, 't': 22, 'action': 'forward', 'reward': -9.877714420434186, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent attempted driving forward through a red light. (rewarded -9.88)
23% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (2, 3), heading: (0, 1), action: None, reward: -0.40262126241
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'right', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', 'right'), 'deadline': 7, 't': 23, 'action': None, 'reward': -0.4026212624098019, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', 'right')
Agent properly idled at a red light. (rewarded -0.40)
20% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (2, 3), heading: (0, 1), action: forward, reward: -9.06235230585
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'right', 'left': 'right'}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', 'forward', 'right'), 'deadline': 6, 't': 24, 'action': 'forward', 'reward': -9.062352305854226, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', 'right')
Agent attempted driving forward through a red light. (rewarded -9.06)
17% of time remaining to reach destination.
/-------------------
| Step 25 Results
\-------------------
Environment.step(): t = 25
Environment.act() [POST]: location: (2, 3), heading: (0, 1), action: left, reward: -10.8114839633
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 5, 't': 25, 'action': 'left', 'reward': -10.811483963336665, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent attempted driving left through a red light. (rewarded -10.81)
13% of time remaining to reach destination.
/-------------------
| Step 26 Results
\-------------------
Environment.step(): t = 26
Environment.act() [POST]: location: (1, 3), heading: (-1, 0), action: right, reward: 0.410810187846
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 4, 't': 26, 'action': 'right', 'reward': 0.4108101878458328, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent followed the waypoint right. (rewarded 0.41)
10% of time remaining to reach destination.
/-------------------
| Step 27 Results
\-------------------
Environment.step(): t = 27
Environment.act() [POST]: location: (1, 2), heading: (0, -1), action: right, reward: 0.892853487007
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'right'), 'deadline': 3, 't': 27, 'action': 'right', 'reward': 0.8928534870071645, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'right')
Agent drove right instead of forward. (rewarded 0.89)
7% of time remaining to reach destination.
/-------------------
| Step 28 Results
\-------------------
Environment.step(): t = 28
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: left, reward: 1.6560037265
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 2, 't': 28, 'action': 'left', 'reward': 1.656003726496589, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.66)
3% of time remaining to reach destination.
/-------------------
| Step 29 Results
\-------------------
Environment.step(): t = 29
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: left, reward: -9.04122326086
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 1, 't': 29, 'action': 'left', 'reward': -9.041223260856665, 'waypoint': 'forward'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -9.04)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 8
\-------------------------
Environment.reset(): Trial set up with start = (5, 2), destination = (3, 4), deadline = 20
Simulating trial. . .
epsilon = 0.9324; alpha = 0.3400
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 2), heading: (0, -1), action: None, reward: -5.21633876189
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 20, 't': 0, 'action': None, 'reward': -5.216338761886792, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.22)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 2), heading: (1, 0), action: right, reward: 0.325004787922
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'left'), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 0.32500478792207643, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'left')
Agent drove right instead of left. (rewarded 0.33)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 2), heading: (1, 0), action: forward, reward: 0.602471659764
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', 'left'), 'deadline': 18, 't': 2, 'action': 'forward', 'reward': 0.6024716597638121, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', 'left')
Agent drove forward instead of right. (rewarded 0.60)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 2), heading: (1, 0), action: None, reward: 1.37737291421
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 17, 't': 3, 'action': None, 'reward': 1.377372914209407, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.38)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 2), heading: (1, 0), action: left, reward: -9.30414226462
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 16, 't': 4, 'action': 'left', 'reward': -9.304142264616006, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -9.30)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 7), heading: (0, -1), action: left, reward: 1.71442021186
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 15, 't': 5, 'action': 'left', 'reward': 1.714420211856613, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove left instead of forward. (rewarded 1.71)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (7, 7), heading: (0, -1), action: None, reward: 0.0473739819788
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 14, 't': 6, 'action': None, 'reward': 0.04737398197875864, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.05)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: left, reward: 1.49656099819
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 13, 't': 7, 'action': 'left', 'reward': 1.496560998188283, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent drove left instead of right. (rewarded 1.50)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: None, reward: 2.82167010112
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 12, 't': 8, 'action': None, 'reward': 2.8216701011166805, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.82)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: None, reward: -4.35059156618
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 11, 't': 9, 'action': None, 'reward': -4.35059156618303, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'right')
Agent idled at a green light with no oncoming traffic. (rewarded -4.35)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: None, reward: -5.54665065913
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 10, 't': 10, 'action': None, 'reward': -5.5466506591272, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.55)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: forward, reward: -9.91604458055
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 9, 't': 11, 'action': 'forward', 'reward': -9.916044580548776, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -9.92)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (6, 6), heading: (0, -1), action: right, reward: 1.10074209848
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 8, 't': 12, 'action': 'right', 'reward': 1.100742098483377, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent drove right instead of forward. (rewarded 1.10)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (6, 5), heading: (0, -1), action: forward, reward: 0.476519358241
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 7, 't': 13, 'action': 'forward', 'reward': 0.476519358240863, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove forward instead of left. (rewarded 0.48)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (6, 4), heading: (0, -1), action: forward, reward: 0.0199557334819
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 6, 't': 14, 'action': 'forward', 'reward': 0.019955733481937243, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent drove forward instead of left. (rewarded 0.02)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (5, 4), heading: (-1, 0), action: left, reward: 1.89967523069
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 5, 't': 15, 'action': 'left', 'reward': 1.899675230692627, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.90)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (5, 3), heading: (0, -1), action: right, reward: 0.631843593216
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 4, 't': 16, 'action': 'right', 'reward': 0.6318435932163313, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent drove right instead of forward. (rewarded 0.63)
15% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (5, 3), heading: (0, -1), action: left, reward: -9.27725105848
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 3, 't': 17, 'action': 'left', 'reward': -9.277251058475976, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent attempted driving left through a red light. (rewarded -9.28)
10% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (5, 3), heading: (0, -1), action: left, reward: -9.59167527413
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'forward', 'left'), 'deadline': 2, 't': 18, 'action': 'left', 'reward': -9.591675274134055, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'left')
Agent attempted driving left through a red light. (rewarded -9.59)
5% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (5, 3), heading: (0, -1), action: forward, reward: -10.2919809408
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 1, 't': 19, 'action': 'forward', 'reward': -10.291980940848735, 'waypoint': 'left'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('left', 'red', 'forward', None)
Agent attempted driving forward through a red light. (rewarded -10.29)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 9
\-------------------------
Environment.reset(): Trial set up with start = (8, 5), destination = (5, 7), deadline = 25
Simulating trial. . .
epsilon = 0.9231; alpha = 0.3200
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 4), heading: (0, -1), action: left, reward: 1.70970268101
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 25, 't': 0, 'action': 'left', 'reward': 1.7097026810125464, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent drove left instead of right. (rewarded 1.71)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 4), heading: (0, -1), action: None, reward: -5.79292455517
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 24, 't': 1, 'action': None, 'reward': -5.792924555172394, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.79)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 4), heading: (0, -1), action: None, reward: -5.5284524497
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 23, 't': 2, 'action': None, 'reward': -5.528452449701424, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.53)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (1, 4), heading: (1, 0), action: right, reward: 0.383904798028
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 22, 't': 3, 'action': 'right', 'reward': 0.3839047980279775, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove right instead of left. (rewarded 0.38)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 4), heading: (1, 0), action: forward, reward: 0.773078014594
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 0.7730780145938226, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove forward instead of left. (rewarded 0.77)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 5), heading: (0, 1), action: right, reward: 1.66937692607
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 20, 't': 5, 'action': 'right', 'reward': 1.6693769260708353, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove right instead of forward. (rewarded 1.67)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 5), heading: (0, 1), action: forward, reward: -9.6495653633
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': 'right'}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'right', 'right'), 'deadline': 19, 't': 6, 'action': 'forward', 'reward': -9.649565363295213, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', 'right')
Agent attempted driving forward through a red light. (rewarded -9.65)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 5), heading: (0, 1), action: left, reward: -20.7112389032
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 3, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 18, 't': 7, 'action': 'left', 'reward': -20.711238903167242, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent attempted driving left through traffic and cause a minor accident. (rewarded -20.71)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 5), heading: (0, 1), action: None, reward: -4.81318762465
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 17, 't': 8, 'action': None, 'reward': -4.813187624653816, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.81)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 5), heading: (0, 1), action: left, reward: -19.5307051026
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 3, 'light': 'green', 'state': ('left', 'green', 'right', None), 'deadline': 16, 't': 9, 'action': 'left', 'reward': -19.53070510264811, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', None)
Agent attempted driving left through traffic and cause a minor accident. (rewarded -19.53)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: right, reward: 1.26527468615
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 15, 't': 10, 'action': 'right', 'reward': 1.2652746861495694, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent drove right instead of left. (rewarded 1.27)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (1, 4), heading: (0, -1), action: right, reward: 0.804457226204
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 14, 't': 11, 'action': 'right', 'reward': 0.8044572262036195, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent drove right instead of forward. (rewarded 0.80)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (1, 4), heading: (0, -1), action: left, reward: -9.69947272894
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'forward', 'left'), 'deadline': 13, 't': 12, 'action': 'left', 'reward': -9.699472728943254, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'left')
Agent attempted driving left through a red light. (rewarded -9.70)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (1, 4), heading: (0, -1), action: left, reward: -20.7976804136
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 3, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 12, 't': 13, 'action': 'left', 'reward': -20.797680413582775, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent attempted driving left through traffic and cause a minor accident. (rewarded -20.80)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (1, 4), heading: (0, -1), action: None, reward: -5.78363160087
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', 'right', None), 'deadline': 11, 't': 14, 'action': None, 'reward': -5.783631600873496, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.78)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (1, 4), heading: (0, -1), action: right, reward: -20.73851186
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 3, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 10, 't': 15, 'action': 'right', 'reward': -20.73851186002524, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent attempted driving right through traffic and cause a minor accident. (rewarded -20.74)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (1, 4), heading: (0, -1), action: None, reward: 2.43705209161
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 9, 't': 16, 'action': None, 'reward': 2.43705209161328, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.44)
32% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (2, 4), heading: (1, 0), action: right, reward: 0.175206878455
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 8, 't': 17, 'action': 'right', 'reward': 0.17520687845509597, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove right instead of left. (rewarded 0.18)
28% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (2, 5), heading: (0, 1), action: right, reward: -0.224043786387
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 7, 't': 18, 'action': 'right', 'reward': -0.2240437863870548, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent drove right instead of forward. (rewarded -0.22)
24% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: left, reward: 0.62136140866
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 6, 't': 19, 'action': 'left', 'reward': 0.6213614086604902, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 0.62)
20% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (3, 6), heading: (0, 1), action: right, reward: 0.734437472799
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 5, 't': 20, 'action': 'right', 'reward': 0.7344374727990473, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove right instead of forward. (rewarded 0.73)
16% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (3, 6), heading: (0, 1), action: forward, reward: -9.73577583196
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 4, 't': 21, 'action': 'forward', 'reward': -9.735775831961494, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -9.74)
12% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (3, 6), heading: (0, 1), action: forward, reward: -9.55061925233
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 3, 't': 22, 'action': 'forward', 'reward': -9.55061925232777, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -9.55)
8% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (3, 6), heading: (0, 1), action: None, reward: -5.43694780395
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 2, 't': 23, 'action': None, 'reward': -5.436947803952621, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -5.44)
4% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (4, 6), heading: (1, 0), action: left, reward: 1.57372383361
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 1, 't': 24, 'action': 'left', 'reward': 1.573723833606026, 'waypoint': 'left'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 1.57)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 10
\-------------------------
Environment.reset(): Trial set up with start = (1, 5), destination = (6, 2), deadline = 30
Simulating trial. . .
epsilon = 0.9139; alpha = 0.3000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 6), heading: (0, 1), action: right, reward: 2.48548116587
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'right'), 'deadline': 30, 't': 0, 'action': 'right', 'reward': 2.485481165867736, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'right')
Agent followed the waypoint right. (rewarded 2.49)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: left, reward: 1.66926170453
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 29, 't': 1, 'action': 'left', 'reward': 1.6692617045313467, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent drove left instead of right. (rewarded 1.67)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: None, reward: -4.69234595411
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 28, 't': 2, 'action': None, 'reward': -4.692345954110756, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.69)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: None, reward: -5.5727100353
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'right'}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', None, 'right'), 'deadline': 27, 't': 3, 'action': None, 'reward': -5.572710035299056, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'right')
Agent idled at a green light with no oncoming traffic. (rewarded -5.57)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 5), heading: (0, -1), action: left, reward: 0.518959517904
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 26, 't': 4, 'action': 'left', 'reward': 0.5189595179043344, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent drove left instead of right. (rewarded 0.52)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 5), heading: (0, -1), action: None, reward: 1.60831346653
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 25, 't': 5, 'action': None, 'reward': 1.6083134665301624, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.61)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 5), heading: (0, -1), action: forward, reward: -10.4058113867
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 24, 't': 6, 'action': 'forward', 'reward': -10.405811386730043, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent attempted driving forward through a red light. (rewarded -10.41)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: left, reward: 1.64774004034
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 23, 't': 7, 'action': 'left', 'reward': 1.6477400403381661, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.65)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: forward, reward: 2.40062953322
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 22, 't': 8, 'action': 'forward', 'reward': 2.4006295332172543, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.40)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (8, 4), heading: (0, -1), action: right, reward: 1.7814158162
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 21, 't': 9, 'action': 'right', 'reward': 1.7814158161959408, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove right instead of forward. (rewarded 1.78)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (8, 4), heading: (0, -1), action: left, reward: -10.8881861385
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 20, 't': 10, 'action': 'left', 'reward': -10.888186138479973, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -10.89)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (8, 4), heading: (0, -1), action: left, reward: -9.8121283878
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 19, 't': 11, 'action': 'left', 'reward': -9.812128387804291, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -9.81)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (8, 4), heading: (0, -1), action: None, reward: 1.61518355905
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 18, 't': 12, 'action': None, 'reward': 1.6151835590485675, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.62)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (7, 4), heading: (-1, 0), action: left, reward: 1.29424524902
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 17, 't': 13, 'action': 'left', 'reward': 1.294245249017883, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 1.29)
53% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (7, 3), heading: (0, -1), action: right, reward: 1.48904594263
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'left'), 'deadline': 16, 't': 14, 'action': 'right', 'reward': 1.4890459426329823, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'left')
Agent drove right instead of forward. (rewarded 1.49)
50% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: right, reward: 0.337163931959
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 15, 't': 15, 'action': 'right', 'reward': 0.33716393195904215, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent drove right instead of left. (rewarded 0.34)
47% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: None, reward: -4.94087497072
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 14, 't': 16, 'action': None, 'reward': -4.9408749707201, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -4.94)
43% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: right, reward: -20.8762668008
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': 'forward'}, 'violation': 3, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 13, 't': 17, 'action': 'right', 'reward': -20.876266800778588, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent attempted driving right through traffic and cause a minor accident. (rewarded -20.88)
40% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: None, reward: 2.43232051817
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 12, 't': 18, 'action': None, 'reward': 2.4323205181729373, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.43)
37% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (8, 4), heading: (0, 1), action: right, reward: 1.10565438843
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 11, 't': 19, 'action': 'right', 'reward': 1.105654388431069, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent drove right instead of left. (rewarded 1.11)
33% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (1, 4), heading: (1, 0), action: left, reward: 0.793343286262
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 10, 't': 20, 'action': 'left', 'reward': 0.793343286262491, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent drove left instead of right. (rewarded 0.79)
30% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (1, 4), heading: (1, 0), action: forward, reward: -40.0070746931
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 9, 't': 21, 'action': 'forward', 'reward': -40.00707469314189, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -40.01)
27% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (1, 5), heading: (0, 1), action: right, reward: -0.116908611758
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 8, 't': 22, 'action': 'right', 'reward': -0.1169086117576641, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent drove right instead of left. (rewarded -0.12)
23% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (1, 5), heading: (0, 1), action: left, reward: -10.997505876
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 7, 't': 23, 'action': 'left', 'reward': -10.997505876030766, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -11.00)
20% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (1, 5), heading: (0, 1), action: forward, reward: -10.290189166
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 6, 't': 24, 'action': 'forward', 'reward': -10.290189166034015, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -10.29)
17% of time remaining to reach destination.
/-------------------
| Step 25 Results
\-------------------
Environment.step(): t = 25
Environment.act() [POST]: location: (1, 6), heading: (0, 1), action: forward, reward: 0.245355404144
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 5, 't': 25, 'action': 'forward', 'reward': 0.24535540414422075, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent drove forward instead of right. (rewarded 0.25)
13% of time remaining to reach destination.
/-------------------
| Step 26 Results
\-------------------
Environment.step(): t = 26
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: forward, reward: 1.10641020444
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 4, 't': 26, 'action': 'forward', 'reward': 1.1064102044370276, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent drove forward instead of right. (rewarded 1.11)
10% of time remaining to reach destination.
/-------------------
| Step 27 Results
\-------------------
Environment.step(): t = 27
Environment.act() [POST]: location: (2, 7), heading: (1, 0), action: left, reward: -0.146219952769
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 3, 't': 27, 'action': 'left', 'reward': -0.1462199527685264, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent drove left instead of right. (rewarded -0.15)
7% of time remaining to reach destination.
/-------------------
| Step 28 Results
\-------------------
Environment.step(): t = 28
Environment.act() [POST]: location: (2, 6), heading: (0, -1), action: left, reward: 0.327368657789
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 2, 't': 28, 'action': 'left', 'reward': 0.3273686577893248, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent drove left instead of right. (rewarded 0.33)
3% of time remaining to reach destination.
/-------------------
| Step 29 Results
\-------------------
Environment.step(): t = 29
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: right, reward: 0.642230886975
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', None), 'deadline': 1, 't': 29, 'action': 'right', 'reward': 0.6422308869750524, 'waypoint': 'left'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('left', 'green', 'right', None)
Agent drove right instead of left. (rewarded 0.64)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 11
\-------------------------
Environment.reset(): Trial set up with start = (3, 4), destination = (7, 5), deadline = 25
Simulating trial. . .
epsilon = 0.9048; alpha = 0.2800
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: forward, reward: -40.7912514008
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('right', 'red', None, 'forward'), 'deadline': 25, 't': 0, 'action': 'forward', 'reward': -40.79125140084329, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'forward')
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -40.79)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: None, reward: 1.8474973938
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 24, 't': 1, 'action': None, 'reward': 1.8474973937961228, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.85)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 5), heading: (0, 1), action: right, reward: 2.2408342688
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 23, 't': 2, 'action': 'right', 'reward': 2.24083426880391, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.24)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: right, reward: 1.85407165919
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 22, 't': 3, 'action': 'right', 'reward': 1.8540716591940032, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.85)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: forward, reward: -10.9251810845
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'right', 'left'), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': -10.925181084461189, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', 'left')
Agent attempted driving forward through a red light. (rewarded -10.93)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: left, reward: -9.79313910805
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 20, 't': 5, 'action': 'left', 'reward': -9.793139108051763, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -9.79)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: forward, reward: -9.46618703587
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 19, 't': 6, 'action': 'forward', 'reward': -9.466187035866755, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent attempted driving forward through a red light. (rewarded -9.47)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 4), heading: (0, -1), action: right, reward: 0.880118170511
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 18, 't': 7, 'action': 'right', 'reward': 0.8801181705107326, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent drove right instead of forward. (rewarded 0.88)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 4), heading: (0, -1), action: None, reward: 2.06841966068
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 17, 't': 8, 'action': None, 'reward': 2.0684196606783116, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.07)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 4), heading: (0, -1), action: left, reward: -9.00656348464
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 16, 't': 9, 'action': 'left', 'reward': -9.006563484641365, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent attempted driving left through a red light. (rewarded -9.01)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: left, reward: 2.73449440951
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 15, 't': 10, 'action': 'left', 'reward': 2.7344944095130996, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.73)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: None, reward: -4.61341696811
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 14, 't': 11, 'action': None, 'reward': -4.6134169681144765, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.61)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (8, 4), heading: (-1, 0), action: forward, reward: 2.68770542005
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 13, 't': 12, 'action': 'forward', 'reward': 2.6877054200518042, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.69)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (8, 4), heading: (-1, 0), action: None, reward: -5.97492101648
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 12, 't': 13, 'action': None, 'reward': -5.974921016478021, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.97)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (7, 4), heading: (-1, 0), action: forward, reward: 1.26823652319
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 11, 't': 14, 'action': 'forward', 'reward': 1.2682365231919, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.27)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (7, 3), heading: (0, -1), action: right, reward: 1.1247166914
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 10, 't': 15, 'action': 'right', 'reward': 1.124716691396475, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent drove right instead of left. (rewarded 1.12)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (7, 3), heading: (0, -1), action: forward, reward: -9.05198956511
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 9, 't': 16, 'action': 'forward', 'reward': -9.051989565112692, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent attempted driving forward through a red light. (rewarded -9.05)
32% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (7, 3), heading: (0, -1), action: None, reward: -4.83654724632
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 8, 't': 17, 'action': None, 'reward': -4.83654724632194, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.84)
28% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (7, 2), heading: (0, -1), action: forward, reward: 0.707761999999
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', 'forward'), 'deadline': 7, 't': 18, 'action': 'forward', 'reward': 0.7077619999986446, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', 'forward')
Agent drove forward instead of right. (rewarded 0.71)
24% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (7, 2), heading: (0, -1), action: left, reward: -39.7070755847
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 6, 't': 19, 'action': 'left', 'reward': -39.70707558473355, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -39.71)
20% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (7, 2), heading: (0, -1), action: forward, reward: -39.908843534
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 5, 't': 20, 'action': 'forward', 'reward': -39.90884353398332, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -39.91)
16% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (7, 2), heading: (0, -1), action: None, reward: -4.18280137959
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'left'}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'forward', 'left'), 'deadline': 4, 't': 21, 'action': None, 'reward': -4.182801379587781, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'left')
Agent idled at a green light with no oncoming traffic. (rewarded -4.18)
12% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (7, 7), heading: (0, -1), action: forward, reward: 0.972358097854
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 3, 't': 22, 'action': 'forward', 'reward': 0.9723580978536197, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 0.97)
8% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (7, 6), heading: (0, -1), action: forward, reward: 1.61399510064
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 2, 't': 23, 'action': 'forward', 'reward': 1.613995100636671, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.61)
4% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (7, 6), heading: (0, -1), action: None, reward: -5.07877610118
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 1, 't': 24, 'action': None, 'reward': -5.078776101182443, 'waypoint': 'forward'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('forward', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.08)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 12
\-------------------------
Environment.reset(): Trial set up with start = (5, 2), destination = (1, 4), deadline = 30
Simulating trial. . .
epsilon = 0.8958; alpha = 0.2600
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 2), heading: (0, 1), action: left, reward: -9.80759003071
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': 'right'}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'forward', 'right'), 'deadline': 30, 't': 0, 'action': 'left', 'reward': -9.807590030706573, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'right')
Agent attempted driving left through a red light. (rewarded -9.81)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 2), heading: (0, 1), action: forward, reward: -9.17981718161
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 29, 't': 1, 'action': 'forward', 'reward': -9.179817181607746, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent attempted driving forward through a red light. (rewarded -9.18)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 2), heading: (0, 1), action: right, reward: -19.9320367189
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': 'forward'}, 'violation': 3, 'light': 'red', 'state': ('left', 'red', 'forward', 'forward'), 'deadline': 28, 't': 2, 'action': 'right', 'reward': -19.932036718875374, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'forward')
Agent attempted driving right through traffic and cause a minor accident. (rewarded -19.93)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 2), heading: (0, 1), action: left, reward: -10.2322047772
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 27, 't': 3, 'action': 'left', 'reward': -10.23220477715837, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent attempted driving left through a red light. (rewarded -10.23)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 2), heading: (0, 1), action: None, reward: -4.85644218167
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 26, 't': 4, 'action': None, 'reward': -4.856442181668075, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.86)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 2), heading: (0, 1), action: None, reward: -5.7542098058
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 25, 't': 5, 'action': None, 'reward': -5.754209805801269, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.75)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: forward, reward: 1.5863809275
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', None), 'deadline': 24, 't': 6, 'action': 'forward', 'reward': 1.5863809275004215, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', None)
Agent drove forward instead of left. (rewarded 1.59)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: forward, reward: -40.6529509386
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 23, 't': 7, 'action': 'forward', 'reward': -40.652950938580254, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -40.65)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (4, 3), heading: (-1, 0), action: right, reward: 0.509425129916
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 22, 't': 8, 'action': 'right', 'reward': 0.5094251299161946, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent drove right instead of left. (rewarded 0.51)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (4, 4), heading: (0, 1), action: left, reward: 0.965553902003
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 21, 't': 9, 'action': 'left', 'reward': 0.9655539020030133, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove left instead of forward. (rewarded 0.97)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (5, 4), heading: (1, 0), action: left, reward: 1.17228922921
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 20, 't': 10, 'action': 'left', 'reward': 1.172289229213194, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent drove left instead of right. (rewarded 1.17)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (5, 4), heading: (1, 0), action: left, reward: -19.3249644511
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 3, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 19, 't': 11, 'action': 'left', 'reward': -19.324964451149388, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent attempted driving left through traffic and cause a minor accident. (rewarded -19.32)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (5, 4), heading: (1, 0), action: None, reward: -5.3679085841
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 18, 't': 12, 'action': None, 'reward': -5.367908584099095, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.37)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (5, 5), heading: (0, 1), action: right, reward: 1.62300612477
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 17, 't': 13, 'action': 'right', 'reward': 1.6230061247727323, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove right instead of forward. (rewarded 1.62)
53% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (4, 5), heading: (-1, 0), action: right, reward: 1.58669253041
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 16, 't': 14, 'action': 'right', 'reward': 1.5866925304063233, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove right instead of left. (rewarded 1.59)
50% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (4, 4), heading: (0, -1), action: right, reward: 0.918075112602
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 15, 't': 15, 'action': 'right', 'reward': 0.9180751126017704, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove right instead of forward. (rewarded 0.92)
47% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (5, 4), heading: (1, 0), action: right, reward: 1.30917463174
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 14, 't': 16, 'action': 'right', 'reward': 1.3091746317421171, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 1.31)
43% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (5, 4), heading: (1, 0), action: left, reward: -10.0669532175
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'left', 'right'), 'deadline': 13, 't': 17, 'action': 'left', 'reward': -10.066953217499318, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'right')
Agent attempted driving left through a red light. (rewarded -10.07)
40% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (5, 4), heading: (1, 0), action: left, reward: -10.7794556046
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'left', 'right'), 'deadline': 12, 't': 18, 'action': 'left', 'reward': -10.779455604568076, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'right')
Agent attempted driving left through a red light. (rewarded -10.78)
37% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (5, 5), heading: (0, 1), action: right, reward: -0.254771530273
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 11, 't': 19, 'action': 'right', 'reward': -0.2547715302729584, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent drove right instead of forward. (rewarded -0.25)
33% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (5, 5), heading: (0, 1), action: left, reward: -9.79029996645
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 10, 't': 20, 'action': 'left', 'reward': -9.790299966450974, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -9.79)
30% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (5, 5), heading: (0, 1), action: None, reward: -4.97821870802
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 9, 't': 21, 'action': None, 'reward': -4.978218708020368, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.98)
27% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (4, 5), heading: (-1, 0), action: right, reward: -0.416305724769
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 8, 't': 22, 'action': 'right', 'reward': -0.4163057247693607, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove right instead of left. (rewarded -0.42)
23% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (4, 5), heading: (-1, 0), action: None, reward: -5.42474284882
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 7, 't': 23, 'action': None, 'reward': -5.42474284881849, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -5.42)
20% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (4, 6), heading: (0, 1), action: left, reward: 0.165090017832
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 6, 't': 24, 'action': 'left', 'reward': 0.16509001783178423, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent drove left instead of forward. (rewarded 0.17)
17% of time remaining to reach destination.
/-------------------
| Step 25 Results
\-------------------
Environment.step(): t = 25
Environment.act() [POST]: location: (4, 6), heading: (0, 1), action: forward, reward: -40.7709840114
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('right', 'red', None, 'forward'), 'deadline': 5, 't': 25, 'action': 'forward', 'reward': -40.770984011367034, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'forward')
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -40.77)
13% of time remaining to reach destination.
/-------------------
| Step 26 Results
\-------------------
Environment.step(): t = 26
Environment.act() [POST]: location: (4, 6), heading: (0, 1), action: forward, reward: -9.9085213511
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', 'right', None), 'deadline': 4, 't': 26, 'action': 'forward', 'reward': -9.908521351104456, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', None)
Agent attempted driving forward through a red light. (rewarded -9.91)
10% of time remaining to reach destination.
/-------------------
| Step 27 Results
\-------------------
Environment.step(): t = 27
Environment.act() [POST]: location: (4, 6), heading: (0, 1), action: forward, reward: -9.86273632778
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 3, 't': 27, 'action': 'forward', 'reward': -9.862736327783784, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent attempted driving forward through a red light. (rewarded -9.86)
7% of time remaining to reach destination.
/-------------------
| Step 28 Results
\-------------------
Environment.step(): t = 28
Environment.act() [POST]: location: (4, 7), heading: (0, 1), action: forward, reward: -0.667547407531
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 2, 't': 28, 'action': 'forward', 'reward': -0.6675474075313165, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent drove forward instead of right. (rewarded -0.67)
3% of time remaining to reach destination.
/-------------------
| Step 29 Results
\-------------------
Environment.step(): t = 29
Environment.act() [POST]: location: (4, 7), heading: (0, 1), action: right, reward: -19.5614236908
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 3, 'light': 'red', 'state': ('right', 'red', None, 'forward'), 'deadline': 1, 't': 29, 'action': 'right', 'reward': -19.56142369079384, 'waypoint': 'right'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('right', 'red', None, 'forward')
Agent attempted driving right through traffic and cause a minor accident. (rewarded -19.56)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 13
\-------------------------
Environment.reset(): Trial set up with start = (7, 2), destination = (3, 3), deadline = 25
Simulating trial. . .
epsilon = 0.8869; alpha = 0.2400
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 2), heading: (-1, 0), action: right, reward: 1.03622120068
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 1.0362212006846134, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove right instead of left. (rewarded 1.04)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 7), heading: (0, -1), action: right, reward: 0.943997353921
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 24, 't': 1, 'action': 'right', 'reward': 0.9439973539211535, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent drove right instead of forward. (rewarded 0.94)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 7), heading: (1, 0), action: right, reward: 0.879670222773
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 23, 't': 2, 'action': 'right', 'reward': 0.8796702227725077, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent drove right instead of left. (rewarded 0.88)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 7), heading: (1, 0), action: forward, reward: -9.02556954177
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': -9.025569541769148, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -9.03)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 7), heading: (1, 0), action: None, reward: -5.49101961857
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 21, 't': 4, 'action': None, 'reward': -5.491019618567304, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent idled at a green light with no oncoming traffic. (rewarded -5.49)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 7), heading: (1, 0), action: None, reward: -5.61477571096
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': 'left'}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'forward', 'left'), 'deadline': 20, 't': 5, 'action': None, 'reward': -5.6147757109571135, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'left')
Agent idled at a green light with no oncoming traffic. (rewarded -5.61)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: forward, reward: 0.928232528039
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 19, 't': 6, 'action': 'forward', 'reward': 0.9282325280393127, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 0.93)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: forward, reward: 1.75895902735
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 1.75895902734911, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.76)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: None, reward: 1.79677563712
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 17, 't': 8, 'action': None, 'reward': 1.7967756371169432, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.80)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: None, reward: -4.77450283728
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 16, 't': 9, 'action': None, 'reward': -4.774502837282671, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.77)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: None, reward: -4.82305018479
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 15, 't': 10, 'action': None, 'reward': -4.823050184794194, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.82)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: left, reward: -39.5643337015
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'forward', 'left': None}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 14, 't': 11, 'action': 'left', 'reward': -39.56433370146168, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -39.56)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: forward, reward: -9.16117287125
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 13, 't': 12, 'action': 'forward', 'reward': -9.161172871251201, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent attempted driving forward through a red light. (rewarded -9.16)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (1, 2), heading: (0, 1), action: right, reward: 0.592507621093
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 12, 't': 13, 'action': 'right', 'reward': 0.5925076210933986, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove right instead of forward. (rewarded 0.59)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: right, reward: 1.56168057288
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 11, 't': 14, 'action': 'right', 'reward': 1.5616805728771777, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent drove right instead of left. (rewarded 1.56)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: right, reward: -19.0915235872
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'right', 'left': 'forward'}, 'violation': 3, 'light': 'red', 'state': ('left', 'red', 'right', 'forward'), 'deadline': 10, 't': 15, 'action': 'right', 'reward': -19.091523587211576, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', 'forward')
Agent attempted driving right through traffic and cause a minor accident. (rewarded -19.09)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (7, 2), heading: (-1, 0), action: forward, reward: 0.685411500118
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 9, 't': 16, 'action': 'forward', 'reward': 0.6854115001184727, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove forward instead of left. (rewarded 0.69)
32% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (7, 2), heading: (-1, 0), action: None, reward: 0.726396888749
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 8, 't': 17, 'action': None, 'reward': 0.7263968887485688, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 0.73)
28% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (7, 2), heading: (-1, 0), action: None, reward: 0.760595839782
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 7, 't': 18, 'action': None, 'reward': 0.7605958397817467, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 0.76)
24% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (6, 2), heading: (-1, 0), action: forward, reward: 0.543654300813
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 6, 't': 19, 'action': 'forward', 'reward': 0.5436543008126139, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent drove forward instead of left. (rewarded 0.54)
20% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (5, 2), heading: (-1, 0), action: forward, reward: 1.53317349423
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 5, 't': 20, 'action': 'forward', 'reward': 1.5331734942346535, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent followed the waypoint forward. (rewarded 1.53)
16% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (5, 2), heading: (-1, 0), action: left, reward: -10.4977918133
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 4, 't': 21, 'action': 'left', 'reward': -10.497791813253533, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent attempted driving left through a red light. (rewarded -10.50)
12% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (5, 2), heading: (-1, 0), action: None, reward: -4.51766941071
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 3, 't': 22, 'action': None, 'reward': -4.517669410706278, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.52)
8% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (4, 2), heading: (-1, 0), action: forward, reward: 1.07556841593
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 2, 't': 23, 'action': 'forward', 'reward': 1.0755684159268755, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.08)
4% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (4, 2), heading: (-1, 0), action: None, reward: 0.686342423114
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 1, 't': 24, 'action': None, 'reward': 0.6863424231135942, 'waypoint': 'forward'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 0.69)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 14
\-------------------------
Environment.reset(): Trial set up with start = (5, 5), destination = (1, 7), deadline = 30
Simulating trial. . .
epsilon = 0.8781; alpha = 0.2200
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 4), heading: (0, -1), action: forward, reward: 1.62771799558
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', 'right'), 'deadline': 30, 't': 0, 'action': 'forward', 'reward': 1.6277179955826284, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', 'right')
Agent drove forward instead of right. (rewarded 1.63)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 4), heading: (1, 0), action: right, reward: 2.73587370408
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 29, 't': 1, 'action': 'right', 'reward': 2.7358737040780556, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 2.74)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 4), heading: (1, 0), action: None, reward: 2.69569000823
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'forward'), 'deadline': 28, 't': 2, 'action': None, 'reward': 2.6956900082286346, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 2.70)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (6, 3), heading: (0, -1), action: left, reward: 0.854399603557
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 27, 't': 3, 'action': 'left', 'reward': 0.8543996035574476, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent drove left instead of forward. (rewarded 0.85)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 3), heading: (-1, 0), action: left, reward: 0.498330202889
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 26, 't': 4, 'action': 'left', 'reward': 0.49833020288859997, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent drove left instead of right. (rewarded 0.50)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 3), heading: (-1, 0), action: left, reward: -19.0676804604
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'left', 'left': None}, 'violation': 3, 'light': 'green', 'state': ('right', 'green', 'right', None), 'deadline': 25, 't': 5, 'action': 'left', 'reward': -19.067680460427006, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'right', None)
Agent attempted driving left through traffic and cause a minor accident. (rewarded -19.07)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (5, 3), heading: (-1, 0), action: left, reward: -10.0938274142
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 24, 't': 6, 'action': 'left', 'reward': -10.093827414232711, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -10.09)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (5, 3), heading: (-1, 0), action: left, reward: -10.3605990309
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 23, 't': 7, 'action': 'left', 'reward': -10.360599030906764, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -10.36)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (5, 3), heading: (-1, 0), action: forward, reward: -10.9550521002
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 22, 't': 8, 'action': 'forward', 'reward': -10.95505210021822, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent attempted driving forward through a red light. (rewarded -10.96)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (5, 2), heading: (0, -1), action: right, reward: 1.14201466623
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 21, 't': 9, 'action': 'right', 'reward': 1.1420146662302757, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent followed the waypoint right. (rewarded 1.14)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (5, 2), heading: (0, -1), action: None, reward: 0.403369865663
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 20, 't': 10, 'action': None, 'reward': 0.40336986566331845, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.40)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (6, 2), heading: (1, 0), action: right, reward: 1.65552496501
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 19, 't': 11, 'action': 'right', 'reward': 1.6555249650074804, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.66)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (6, 2), heading: (1, 0), action: None, reward: -4.83557937782
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 18, 't': 12, 'action': None, 'reward': -4.835579377818799, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -4.84)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (6, 2), heading: (1, 0), action: right, reward: -19.7221585788
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 3, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 17, 't': 13, 'action': 'right', 'reward': -19.722158578836748, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent attempted driving right through traffic and cause a minor accident. (rewarded -19.72)
53% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (6, 2), heading: (1, 0), action: left, reward: -9.2843643166
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 16, 't': 14, 'action': 'left', 'reward': -9.284364316595932, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -9.28)
50% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (7, 2), heading: (1, 0), action: forward, reward: 2.03248564044
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 15, 't': 15, 'action': 'forward', 'reward': 2.0324856404359375, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.03)
47% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (7, 3), heading: (0, 1), action: right, reward: 1.26950981413
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 14, 't': 16, 'action': 'right', 'reward': 1.2695098141318861, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove right instead of forward. (rewarded 1.27)
43% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (7, 4), heading: (0, 1), action: forward, reward: 1.06490818345
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 13, 't': 17, 'action': 'forward', 'reward': 1.0649081834546317, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent drove forward instead of left. (rewarded 1.06)
40% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (8, 4), heading: (1, 0), action: left, reward: 1.76002579068
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 12, 't': 18, 'action': 'left', 'reward': 1.7600257906817414, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 1.76)
37% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (8, 4), heading: (1, 0), action: left, reward: -10.808464305
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 11, 't': 19, 'action': 'left', 'reward': -10.808464304995137, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent attempted driving left through a red light. (rewarded -10.81)
33% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (8, 4), heading: (1, 0), action: None, reward: 2.4395079915
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 10, 't': 20, 'action': None, 'reward': 2.4395079914975506, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.44)
30% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (8, 4), heading: (1, 0), action: forward, reward: -9.98606050309
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 9, 't': 21, 'action': 'forward', 'reward': -9.986060503093213, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent attempted driving forward through a red light. (rewarded -9.99)
27% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (8, 4), heading: (1, 0), action: None, reward: -5.5845218609
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': 'right'}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'left', 'right'), 'deadline': 8, 't': 22, 'action': None, 'reward': -5.584521860898474, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'right')
Agent idled at a green light with no oncoming traffic. (rewarded -5.58)
23% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (8, 3), heading: (0, -1), action: left, reward: 0.687700368759
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 7, 't': 23, 'action': 'left', 'reward': 0.6877003687590285, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove left instead of forward. (rewarded 0.69)
20% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: right, reward: 1.50483962598
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 6, 't': 24, 'action': 'right', 'reward': 1.5048396259813062, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.50)
17% of time remaining to reach destination.
/-------------------
| Step 25 Results
\-------------------
Environment.step(): t = 25
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: forward, reward: -0.111561123986
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 5, 't': 25, 'action': 'forward', 'reward': -0.11156112398642071, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove forward instead of left. (rewarded -0.11)
13% of time remaining to reach destination.
/-------------------
| Step 26 Results
\-------------------
Environment.step(): t = 26
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: left, reward: -39.1290951332
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 4, 't': 26, 'action': 'left', 'reward': -39.12909513320216, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -39.13)
10% of time remaining to reach destination.
/-------------------
| Step 27 Results
\-------------------
Environment.step(): t = 27
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: forward, reward: -10.0304604163
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 3, 't': 27, 'action': 'forward', 'reward': -10.030460416270369, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -10.03)
7% of time remaining to reach destination.
/-------------------
| Step 28 Results
\-------------------
Environment.step(): t = 28
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: forward, reward: -10.0143673959
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 2, 't': 28, 'action': 'forward', 'reward': -10.014367395908575, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -10.01)
3% of time remaining to reach destination.
/-------------------
| Step 29 Results
\-------------------
Environment.step(): t = 29
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: left, reward: -40.9292355674
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 1, 't': 29, 'action': 'left', 'reward': -40.92923556740313, 'waypoint': 'left'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('left', 'red', None, None)
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -40.93)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 15
\-------------------------
Environment.reset(): Trial set up with start = (4, 7), destination = (7, 6), deadline = 20
Simulating trial. . .
epsilon = 0.8694; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (4, 7), heading: (0, -1), action: None, reward: -4.69577740668
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 20, 't': 0, 'action': None, 'reward': -4.695777406675048, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -4.70)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (4, 7), heading: (0, -1), action: None, reward: -5.89848609098
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 19, 't': 1, 'action': None, 'reward': -5.898486090984482, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -5.90)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 7), heading: (1, 0), action: right, reward: 2.22318601105
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 18, 't': 2, 'action': 'right', 'reward': 2.223186011050994, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 2.22)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 7), heading: (1, 0), action: None, reward: 1.33273628151
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 17, 't': 3, 'action': None, 'reward': 1.3327362815122876, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.33)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 7), heading: (1, 0), action: None, reward: 1.75924490911
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 16, 't': 4, 'action': None, 'reward': 1.7592449091143607, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.76)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 2), heading: (0, 1), action: right, reward: 1.44222069062
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 15, 't': 5, 'action': 'right', 'reward': 1.44222069061657, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove right instead of forward. (rewarded 1.44)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 2), heading: (-1, 0), action: right, reward: 0.454156337125
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 14, 't': 6, 'action': 'right', 'reward': 0.4541563371254682, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove right instead of left. (rewarded 0.45)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (4, 7), heading: (0, -1), action: right, reward: 1.31265454242
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 13, 't': 7, 'action': 'right', 'reward': 1.3126545424198668, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 1.31)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 7), heading: (-1, 0), action: left, reward: -0.118539271066
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 12, 't': 8, 'action': 'left', 'reward': -0.1185392710660883, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent drove left instead of right. (rewarded -0.12)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (3, 2), heading: (0, 1), action: left, reward: 0.0631234271107
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 11, 't': 9, 'action': 'left', 'reward': 0.06312342711067975, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent drove left instead of forward. (rewarded 0.06)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: right, reward: 1.21339134003
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 10, 't': 10, 'action': 'right', 'reward': 1.2133913400326795, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.21)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (2, 7), heading: (0, -1), action: right, reward: -0.0107785125579
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 9, 't': 11, 'action': 'right', 'reward': -0.010778512557939157, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent drove right instead of forward. (rewarded -0.01)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (2, 7), heading: (0, -1), action: None, reward: -5.73317433649
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 8, 't': 12, 'action': None, 'reward': -5.733174336491689, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.73)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (2, 7), heading: (0, -1), action: left, reward: -9.75056314593
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 7, 't': 13, 'action': 'left', 'reward': -9.75056314593423, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -9.75)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (2, 7), heading: (0, -1), action: forward, reward: -9.30099536179
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'left', 'left'), 'deadline': 6, 't': 14, 'action': 'forward', 'reward': -9.300995361791413, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'left')
Agent attempted driving forward through a red light. (rewarded -9.30)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (2, 7), heading: (0, -1), action: None, reward: 2.36362478505
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 5, 't': 15, 'action': None, 'reward': 2.3636247850495224, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.36)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (2, 6), heading: (0, -1), action: forward, reward: -0.0715461899557
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'left'), 'deadline': 4, 't': 16, 'action': 'forward', 'reward': -0.07154618995568462, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'left')
Agent drove forward instead of left. (rewarded -0.07)
15% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: right, reward: 0.631194287764
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 3, 't': 17, 'action': 'right', 'reward': 0.6311942877638425, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove right instead of left. (rewarded 0.63)
10% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: left, reward: -19.633961185
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': 'forward'}, 'violation': 3, 'light': 'green', 'state': ('right', 'green', 'forward', 'forward'), 'deadline': 2, 't': 18, 'action': 'left', 'reward': -19.633961185046672, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', 'forward')
Agent attempted driving left through traffic and cause a minor accident. (rewarded -19.63)
5% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: right, reward: -20.81511785
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': 'forward'}, 'violation': 3, 'light': 'red', 'state': ('right', 'red', 'forward', 'forward'), 'deadline': 1, 't': 19, 'action': 'right', 'reward': -20.81511784999483, 'waypoint': 'right'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('right', 'red', 'forward', 'forward')
Agent attempted driving right through traffic and cause a minor accident. (rewarded -20.82)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 16
\-------------------------
Environment.reset(): Trial set up with start = (1, 3), destination = (6, 7), deadline = 25
Simulating trial. . .
epsilon = 0.8607; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 2), heading: (0, -1), action: forward, reward: 0.579139338726
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 25, 't': 0, 'action': 'forward', 'reward': 0.5791393387264148, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent drove forward instead of left. (rewarded 0.58)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 2), heading: (0, -1), action: forward, reward: -39.9299095737
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', 'forward', 'forward'), 'deadline': 24, 't': 1, 'action': 'forward', 'reward': -39.92990957368481, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'forward')
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -39.93)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 2), heading: (1, 0), action: right, reward: 1.79630590813
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 23, 't': 2, 'action': 'right', 'reward': 1.7963059081340622, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent drove right instead of left. (rewarded 1.80)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 2), heading: (1, 0), action: forward, reward: -40.4746287429
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': -40.47462874289593, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -40.47)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 3), heading: (0, 1), action: right, reward: 0.602104645836
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 21, 't': 4, 'action': 'right', 'reward': 0.602104645836209, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent drove right instead of left. (rewarded 0.60)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 3), heading: (0, 1), action: left, reward: -10.2914319402
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 20, 't': 5, 'action': 'left', 'reward': -10.291431940203722, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent attempted driving left through a red light. (rewarded -10.29)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 3), heading: (0, 1), action: left, reward: -20.0596619941
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 3, 'light': 'green', 'state': ('right', 'green', 'right', None), 'deadline': 19, 't': 6, 'action': 'left', 'reward': -20.059661994109753, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'right', None)
Agent attempted driving left through traffic and cause a minor accident. (rewarded -20.06)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 3), heading: (-1, 0), action: right, reward: 1.20505757479
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 18, 't': 7, 'action': 'right', 'reward': 1.2050575747891203, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.21)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (8, 3), heading: (-1, 0), action: forward, reward: 2.20047979416
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 17, 't': 8, 'action': 'forward', 'reward': 2.2004797941587757, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.20)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (8, 2), heading: (0, -1), action: right, reward: 0.584021535926
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 16, 't': 9, 'action': 'right', 'reward': 0.5840215359261542, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent drove right instead of forward. (rewarded 0.58)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: right, reward: 1.80669842899
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 15, 't': 10, 'action': 'right', 'reward': 1.8066984289853911, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove right instead of left. (rewarded 1.81)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (1, 3), heading: (0, 1), action: right, reward: 1.69426259739
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 14, 't': 11, 'action': 'right', 'reward': 1.6942625973949075, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent drove right instead of left. (rewarded 1.69)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (1, 3), heading: (0, 1), action: left, reward: -10.2397752439
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', 'left', 'left'), 'deadline': 13, 't': 12, 'action': 'left', 'reward': -10.239775243858091, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', 'left')
Agent attempted driving left through a red light. (rewarded -10.24)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (1, 3), heading: (0, 1), action: None, reward: -5.68542653475
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 12, 't': 13, 'action': None, 'reward': -5.685426534752811, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.69)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: left, reward: 1.60194918506
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 11, 't': 14, 'action': 'left', 'reward': 1.6019491850627974, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent drove left instead of right. (rewarded 1.60)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (2, 4), heading: (0, 1), action: right, reward: 0.819360616705
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', 'left'), 'deadline': 10, 't': 15, 'action': 'right', 'reward': 0.819360616704635, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', 'left')
Agent drove right instead of left. (rewarded 0.82)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: right, reward: 0.883876152628
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', 'left'), 'deadline': 9, 't': 16, 'action': 'right', 'reward': 0.8838761526280057, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', 'left')
Agent followed the waypoint right. (rewarded 0.88)
32% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (1, 3), heading: (0, -1), action: right, reward: 0.230839638669
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'left'), 'deadline': 8, 't': 17, 'action': 'right', 'reward': 0.23083963866944712, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'left')
Agent drove right instead of forward. (rewarded 0.23)
28% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: right, reward: 0.396035989694
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 7, 't': 18, 'action': 'right', 'reward': 0.3960359896935901, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove right instead of left. (rewarded 0.40)
24% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: forward, reward: -10.4131373702
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 6, 't': 19, 'action': 'forward', 'reward': -10.413137370245524, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -10.41)
20% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: left, reward: -9.65627925446
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 5, 't': 20, 'action': 'left', 'reward': -9.656279254463886, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent attempted driving left through a red light. (rewarded -9.66)
16% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (2, 4), heading: (0, 1), action: right, reward: 0.968780831489
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', 'forward'), 'deadline': 4, 't': 21, 'action': 'right', 'reward': 0.9687808314890284, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', 'forward')
Agent drove right instead of left. (rewarded 0.97)
12% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (2, 4), heading: (0, 1), action: None, reward: -4.99175298695
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 3, 't': 22, 'action': None, 'reward': -4.991752986953297, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.99)
8% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: left, reward: -0.23256146703
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 2, 't': 23, 'action': 'left', 'reward': -0.23256146703026315, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent drove left instead of right. (rewarded -0.23)
4% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (3, 5), heading: (0, 1), action: right, reward: 0.764213839662
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'right'), 'deadline': 1, 't': 24, 'action': 'right', 'reward': 0.764213839661644, 'waypoint': 'forward'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('forward', 'green', 'forward', 'right')
Agent drove right instead of forward. (rewarded 0.76)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 17
\-------------------------
Environment.reset(): Trial set up with start = (1, 2), destination = (4, 5), deadline = 30
Simulating trial. . .
epsilon = 0.8521; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 2), heading: (0, -1), action: forward, reward: -9.69003886552
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 30, 't': 0, 'action': 'forward', 'reward': -9.69003886551708, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -9.69)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 2), heading: (0, -1), action: left, reward: -9.58336265971
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 29, 't': 1, 'action': 'left', 'reward': -9.583362659713375, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -9.58)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 2), heading: (1, 0), action: right, reward: 2.94228019303
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'right', None), 'deadline': 28, 't': 2, 'action': 'right', 'reward': 2.9422801930271065, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', None)
Agent followed the waypoint right. (rewarded 2.94)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 2), heading: (1, 0), action: None, reward: 1.0801134196
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 27, 't': 3, 'action': None, 'reward': 1.080113419602389, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.08)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 2), heading: (1, 0), action: None, reward: 2.33051202262
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'right'), 'deadline': 26, 't': 4, 'action': None, 'reward': 2.3305120226181586, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 2.33)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 3), heading: (0, 1), action: right, reward: 0.281537234126
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 25, 't': 5, 'action': 'right', 'reward': 0.28153723412596754, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove right instead of forward. (rewarded 0.28)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 3), heading: (0, 1), action: None, reward: -5.44289483241
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 24, 't': 6, 'action': None, 'reward': -5.442894832411758, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.44)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: left, reward: 1.95770974988
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'right'), 'deadline': 23, 't': 7, 'action': 'left', 'reward': 1.95770974988389, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'right')
Agent followed the waypoint left. (rewarded 1.96)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 4), heading: (0, 1), action: right, reward: 1.58502263312
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 22, 't': 8, 'action': 'right', 'reward': 1.5850226331185464, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove right instead of forward. (rewarded 1.59)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 4), heading: (-1, 0), action: right, reward: 1.79463539354
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 21, 't': 9, 'action': 'right', 'reward': 1.7946353935446857, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove right instead of left. (rewarded 1.79)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (2, 4), heading: (-1, 0), action: None, reward: 1.47946214839
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 20, 't': 10, 'action': None, 'reward': 1.479462148392519, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.48)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (2, 4), heading: (-1, 0), action: forward, reward: -39.2177002895
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 19, 't': 11, 'action': 'forward', 'reward': -39.21770028946386, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -39.22)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (2, 4), heading: (-1, 0), action: forward, reward: -10.2931407434
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 18, 't': 12, 'action': 'forward', 'reward': -10.293140743382846, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent attempted driving forward through a red light. (rewarded -10.29)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (2, 5), heading: (0, 1), action: left, reward: 1.08687148979
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 17, 't': 13, 'action': 'left', 'reward': 1.0868714897895235, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.09)
53% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (2, 5), heading: (0, 1), action: None, reward: 2.29352133159
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 16, 't': 14, 'action': None, 'reward': 2.293521331591752, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.29)
50% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (2, 5), heading: (0, 1), action: left, reward: -40.7576134948
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 15, 't': 15, 'action': 'left', 'reward': -40.75761349476219, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -40.76)
47% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (2, 5), heading: (0, 1), action: None, reward: -5.30583793171
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 14, 't': 16, 'action': None, 'reward': -5.305837931707314, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.31)
43% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (2, 5), heading: (0, 1), action: None, reward: -5.72700436033
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, 'right'), 'deadline': 13, 't': 17, 'action': None, 'reward': -5.727004360332228, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'right')
Agent idled at a green light with no oncoming traffic. (rewarded -5.73)
40% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: left, reward: 2.14659445204
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 12, 't': 18, 'action': 'left', 'reward': 2.1465944520381193, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.15)
37% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: None, reward: 1.78564224985
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 11, 't': 19, 'action': None, 'reward': 1.7856422498539168, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.79)
33% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: forward, reward: -10.6205519309
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 10, 't': 20, 'action': 'forward', 'reward': -10.620551930868206, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent attempted driving forward through a red light. (rewarded -10.62)
30% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: left, reward: -40.1066924239
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 9, 't': 21, 'action': 'left', 'reward': -40.1066924239281, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -40.11)
27% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: None, reward: -4.73084867394
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 8, 't': 22, 'action': None, 'reward': -4.730848673940432, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.73)
23% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (3, 4), heading: (0, -1), action: left, reward: 1.01355712483
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 7, 't': 23, 'action': 'left', 'reward': 1.0135571248308457, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove left instead of forward. (rewarded 1.01)
20% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (4, 4), heading: (1, 0), action: right, reward: 1.52116965344
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 6, 't': 24, 'action': 'right', 'reward': 1.5211696534353658, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 1.52)
17% of time remaining to reach destination.
/-------------------
| Step 25 Results
\-------------------
Environment.step(): t = 25
Environment.act() [POST]: location: (4, 4), heading: (1, 0), action: None, reward: 0.0412459382192
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 5, 't': 25, 'action': None, 'reward': 0.04124593821919542, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 0.04)
13% of time remaining to reach destination.
/-------------------
| Step 26 Results
\-------------------
Environment.step(): t = 26
Environment.act() [POST]: location: (5, 4), heading: (1, 0), action: forward, reward: -0.660604461523
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 4, 't': 26, 'action': 'forward', 'reward': -0.6606044615225735, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent drove forward instead of right. (rewarded -0.66)
10% of time remaining to reach destination.
/-------------------
| Step 27 Results
\-------------------
Environment.step(): t = 27
Environment.act() [POST]: location: (5, 4), heading: (1, 0), action: None, reward: -4.66230607971
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 3, 't': 27, 'action': None, 'reward': -4.662306079706937, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -4.66)
7% of time remaining to reach destination.
/-------------------
| Step 28 Results
\-------------------
Environment.step(): t = 28
Environment.act() [POST]: location: (5, 5), heading: (0, 1), action: right, reward: 1.85566591338
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 2, 't': 28, 'action': 'right', 'reward': 1.8556659133775646, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 1.86)
3% of time remaining to reach destination.
/-------------------
| Step 29 Results
\-------------------
Environment.step(): t = 29
Environment.act() [POST]: location: (5, 5), heading: (0, 1), action: None, reward: -0.89568086197
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'right'), 'deadline': 1, 't': 29, 'action': None, 'reward': -0.8956808619701175, 'waypoint': 'right'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('right', 'red', None, 'right')
Agent properly idled at a red light. (rewarded -0.90)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 18
\-------------------------
Environment.reset(): Trial set up with start = (8, 3), destination = (4, 4), deadline = 25
Simulating trial. . .
epsilon = 0.8437; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 3), heading: (-1, 0), action: None, reward: 1.78823646918
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', 'left'), 'deadline': 25, 't': 0, 'action': None, 'reward': 1.788236469183009, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'left')
Agent properly idled at a red light. (rewarded 1.79)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 3), heading: (-1, 0), action: left, reward: -10.9115974449
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 24, 't': 1, 'action': 'left', 'reward': -10.91159744492545, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent attempted driving left through a red light. (rewarded -10.91)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 2), heading: (0, -1), action: right, reward: 1.66068596498
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 23, 't': 2, 'action': 'right', 'reward': 1.6606859649810575, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent drove right instead of left. (rewarded 1.66)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 7), heading: (0, -1), action: forward, reward: 1.20875583048
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': 1.2087558304837664, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent drove forward instead of right. (rewarded 1.21)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: right, reward: 1.69902921683
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 21, 't': 4, 'action': 'right', 'reward': 1.699029216834972, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.70)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: left, reward: -40.9604210215
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', 'left', 'forward'), 'deadline': 20, 't': 5, 'action': 'left', 'reward': -40.960421021486816, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'forward')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -40.96)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: forward, reward: -10.0928874913
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 19, 't': 6, 'action': 'forward', 'reward': -10.092887491338857, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent attempted driving forward through a red light. (rewarded -10.09)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 6), heading: (0, -1), action: left, reward: 0.699790103635
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'left'), 'deadline': 18, 't': 7, 'action': 'left', 'reward': 0.6997901036353829, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'left')
Agent drove left instead of forward. (rewarded 0.70)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: right, reward: 2.35710645932
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 17, 't': 8, 'action': 'right', 'reward': 2.357106459315541, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.36)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: None, reward: -4.87162204466
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 16, 't': 9, 'action': None, 'reward': -4.871622044655065, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.87)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: None, reward: -5.30172802986
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 15, 't': 10, 'action': None, 'reward': -5.301728029856651, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.30)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: None, reward: -4.08334646695
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'left', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 14, 't': 11, 'action': None, 'reward': -4.083346466946423, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.08)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: None, reward: -5.77619624903
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 13, 't': 12, 'action': None, 'reward': -5.776196249033103, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.78)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: forward, reward: -9.0185409352
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 12, 't': 13, 'action': 'forward', 'reward': -9.018540935197107, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -9.02)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: left, reward: -10.8997459334
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 11, 't': 14, 'action': 'left', 'reward': -10.899745933402215, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -10.90)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: left, reward: -10.5157943435
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 10, 't': 15, 'action': 'left', 'reward': -10.515794343475578, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -10.52)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: None, reward: 0.758127929612
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 9, 't': 16, 'action': None, 'reward': 0.7581279296115468, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.76)
32% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: None, reward: -4.80988352706
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 8, 't': 17, 'action': None, 'reward': -4.809883527055189, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.81)
28% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: None, reward: -4.49764855368
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 7, 't': 18, 'action': None, 'reward': -4.497648553679191, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.50)
24% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: None, reward: -4.24642887363
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 6, 't': 19, 'action': None, 'reward': -4.246428873633147, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.25)
20% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: None, reward: -5.91505587964
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 5, 't': 20, 'action': None, 'reward': -5.915055879644429, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.92)
16% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: forward, reward: -40.5838090553
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 4, 't': 21, 'action': 'forward', 'reward': -40.58380905534845, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -40.58)
12% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: forward, reward: -10.7524785268
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 3, 't': 22, 'action': 'forward', 'reward': -10.752478526756542, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -10.75)
8% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: forward, reward: -9.80633130105
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 2, 't': 23, 'action': 'forward', 'reward': -9.806331301049216, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -9.81)
4% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: None, reward: 1.43306646478
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 1, 't': 24, 'action': None, 'reward': 1.4330664647754978, 'waypoint': 'forward'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.43)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 19
\-------------------------
Environment.reset(): Trial set up with start = (2, 4), destination = (7, 3), deadline = 20
Simulating trial. . .
epsilon = 0.8353; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 4), heading: (1, 0), action: left, reward: -9.18683421751
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'left', 'left'), 'deadline': 20, 't': 0, 'action': 'left', 'reward': -9.18683421750584, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'left')
Agent attempted driving left through a red light. (rewarded -9.19)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 5), heading: (0, 1), action: right, reward: 1.00170922869
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 1.0017092286890024, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent drove right instead of left. (rewarded 1.00)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: right, reward: 2.4727967612
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 18, 't': 2, 'action': 'right', 'reward': 2.4727967612021624, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 2.47)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: None, reward: 2.65470426153
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 17, 't': 3, 'action': None, 'reward': 2.6547042615335803, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.65)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: None, reward: 2.64929098515
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 16, 't': 4, 'action': None, 'reward': 2.649290985152001, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.65)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: left, reward: -19.5786165968
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 3, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 15, 't': 5, 'action': 'left', 'reward': -19.578616596818925, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent attempted driving left through traffic and cause a minor accident. (rewarded -19.58)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 4), heading: (0, -1), action: right, reward: 0.0190826126024
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 14, 't': 6, 'action': 'right', 'reward': 0.019082612602387083, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent drove right instead of forward. (rewarded 0.02)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 4), heading: (0, -1), action: None, reward: -5.69708275062
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 13, 't': 7, 'action': None, 'reward': -5.697082750619051, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -5.70)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (1, 4), heading: (0, -1), action: None, reward: -4.17481440375
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 12, 't': 8, 'action': None, 'reward': -4.174814403750311, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -4.17)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (1, 4), heading: (0, -1), action: None, reward: 2.72672081682
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 11, 't': 9, 'action': None, 'reward': 2.726720816822634, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.73)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (1, 4), heading: (0, -1), action: None, reward: 1.62560473495
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 10, 't': 10, 'action': None, 'reward': 1.6256047349451, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.63)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (1, 3), heading: (0, -1), action: forward, reward: 1.65712813181
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 9, 't': 11, 'action': 'forward', 'reward': 1.6571281318111035, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove forward instead of left. (rewarded 1.66)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: right, reward: 1.53554589932
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 8, 't': 12, 'action': 'right', 'reward': 1.5355458993195268, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent drove right instead of left. (rewarded 1.54)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: forward, reward: -9.6070801622
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 7, 't': 13, 'action': 'forward', 'reward': -9.607080162198837, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -9.61)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (2, 4), heading: (0, 1), action: right, reward: 2.06621369201
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 6, 't': 14, 'action': 'right', 'reward': 2.066213692012928, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.07)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (2, 4), heading: (0, 1), action: None, reward: -5.07924580223
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 5, 't': 15, 'action': None, 'reward': -5.079245802234949, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.08)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (2, 4), heading: (0, 1), action: left, reward: -10.6054688657
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 4, 't': 16, 'action': 'left', 'reward': -10.605468865660972, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent attempted driving left through a red light. (rewarded -10.61)
15% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (2, 4), heading: (0, 1), action: left, reward: -39.7512634635
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 4, 'light': 'red', 'state': ('right', 'red', 'right', None), 'deadline': 3, 't': 17, 'action': 'left', 'reward': -39.75126346354794, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', None)
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -39.75)
10% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: right, reward: 1.08520667926
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 2, 't': 18, 'action': 'right', 'reward': 1.0852066792624417, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 1.09)
5% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: left, reward: -39.7133196432
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', 'forward', 'forward'), 'deadline': 1, 't': 19, 'action': 'left', 'reward': -39.713319643170436, 'waypoint': 'forward'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('forward', 'red', 'forward', 'forward')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -39.71)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 20
\-------------------------
Environment.reset(): Trial set up with start = (7, 6), destination = (5, 2), deadline = 20
Simulating trial. . .
epsilon = 0.8270; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (7, 6), heading: (0, -1), action: None, reward: 2.64661947875
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 20, 't': 0, 'action': None, 'reward': 2.646619478752999, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.65)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: right, reward: 1.51349071005
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 1.513490710052427, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent drove right instead of left. (rewarded 1.51)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 7), heading: (0, 1), action: right, reward: 1.58175288071
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 18, 't': 2, 'action': 'right', 'reward': 1.5817528807088588, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.58)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: right, reward: 1.450966587
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 17, 't': 3, 'action': 'right', 'reward': 1.4509665870048747, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.45)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: forward, reward: 1.61674601267
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'right'), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 1.6167460126735402, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'right')
Agent followed the waypoint forward. (rewarded 1.62)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: None, reward: 1.71199221515
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 15, 't': 5, 'action': None, 'reward': 1.7119922151504012, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.71)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: forward, reward: -10.0872795603
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 14, 't': 6, 'action': 'forward', 'reward': -10.087279560263148, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent attempted driving forward through a red light. (rewarded -10.09)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: None, reward: -5.84529429507
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 13, 't': 7, 'action': None, 'reward': -5.845294295069445, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.85)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (5, 7), heading: (-1, 0), action: forward, reward: 2.29954209774
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 12, 't': 8, 'action': 'forward', 'reward': 2.299542097742093, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.30)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 2), heading: (0, 1), action: left, reward: 1.44399852687
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 11, 't': 9, 'action': 'left', 'reward': 1.4439985268687732, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.44)
50% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 21
\-------------------------
Environment.reset(): Trial set up with start = (1, 4), destination = (6, 3), deadline = 20
Simulating trial. . .
epsilon = 0.8187; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 4), heading: (1, 0), action: left, reward: 0.30357606421
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', 'right'), 'deadline': 20, 't': 0, 'action': 'left', 'reward': 0.30357606420967054, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', 'right')
Agent drove left instead of right. (rewarded 0.30)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 4), heading: (1, 0), action: None, reward: -5.68808695455
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 19, 't': 1, 'action': None, 'reward': -5.688086954550634, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -5.69)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 5), heading: (0, 1), action: right, reward: 1.01525349496
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 18, 't': 2, 'action': 'right', 'reward': 1.015253494959051, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent drove right instead of left. (rewarded 1.02)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 5), heading: (0, 1), action: None, reward: 0.4001620245
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 0.4001620245001948, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 0.40)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: forward, reward: 1.37219492681
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', 'right'), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 1.372194926805638, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', 'right')
Agent drove forward instead of right. (rewarded 1.37)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: None, reward: -4.17020800344
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 15, 't': 5, 'action': None, 'reward': -4.170208003438349, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.17)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: right, reward: 2.84002913953
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 14, 't': 6, 'action': 'right', 'reward': 2.8400291395338995, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.84)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: left, reward: 0.998557494123
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 13, 't': 7, 'action': 'left', 'reward': 0.9985574941226443, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove left instead of forward. (rewarded 1.00)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: None, reward: -4.88074657043
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', 'right', None), 'deadline': 12, 't': 8, 'action': None, 'reward': -4.880746570429192, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'right', None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.88)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: left, reward: -39.3343782417
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 4, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 11, 't': 9, 'action': 'left', 'reward': -39.334378241726306, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -39.33)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: right, reward: 1.05372365691
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 10, 't': 10, 'action': 'right', 'reward': 1.053723656914007, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.05)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: None, reward: -5.76284725803
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'left', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 9, 't': 11, 'action': None, 'reward': -5.762847258027599, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.76)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: None, reward: -4.0557044441
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 8, 't': 12, 'action': None, 'reward': -4.055704444097449, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.06)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: left, reward: -9.59434778577
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 7, 't': 13, 'action': 'left', 'reward': -9.594347785768788, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -9.59)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (8, 6), heading: (0, -1), action: right, reward: 0.519079042761
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 6, 't': 14, 'action': 'right', 'reward': 0.5190790427613478, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent drove right instead of forward. (rewarded 0.52)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (8, 5), heading: (0, -1), action: forward, reward: -0.27912071849
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 5, 't': 15, 'action': 'forward', 'reward': -0.2791207184897988, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove forward instead of left. (rewarded -0.28)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (8, 4), heading: (0, -1), action: forward, reward: -0.399385568335
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 4, 't': 16, 'action': 'forward', 'reward': -0.39938556833480976, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent drove forward instead of left. (rewarded -0.40)
15% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (8, 4), heading: (0, -1), action: None, reward: -5.26556873323
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 3, 't': 17, 'action': None, 'reward': -5.26556873323068, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent idled at a green light with no oncoming traffic. (rewarded -5.27)
10% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (8, 4), heading: (0, -1), action: None, reward: -5.58017313907
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': 'left'}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', 'right', 'left'), 'deadline': 2, 't': 18, 'action': None, 'reward': -5.58017313907089, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', 'left')
Agent idled at a green light with no oncoming traffic. (rewarded -5.58)
5% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (8, 4), heading: (0, -1), action: None, reward: 1.25344024161
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 1, 't': 19, 'action': None, 'reward': 1.2534402416076136, 'waypoint': 'left'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('left', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.25)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 22
\-------------------------
Environment.reset(): Trial set up with start = (2, 3), destination = (4, 7), deadline = 20
Simulating trial. . .
epsilon = 0.8106; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 3), heading: (0, 1), action: None, reward: 1.27671340215
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 20, 't': 0, 'action': None, 'reward': 1.2767134021451536, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.28)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 3), heading: (0, 1), action: forward, reward: -10.2741245871
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 19, 't': 1, 'action': 'forward', 'reward': -10.274124587078312, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -10.27)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 3), heading: (0, 1), action: None, reward: 2.68599821105
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.685998211050414, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.69)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 3), heading: (0, 1), action: None, reward: 1.60949309407
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 1.6094930940734142, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 1.61)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 3), heading: (0, 1), action: None, reward: 1.25911370045
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 16, 't': 4, 'action': None, 'reward': 1.2591137004549662, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.26)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 3), heading: (0, 1), action: None, reward: 2.4623073842
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 15, 't': 5, 'action': None, 'reward': 2.4623073842037924, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.46)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 3), heading: (-1, 0), action: right, reward: 1.58283348143
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 14, 't': 6, 'action': 'right', 'reward': 1.5828334814343465, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent drove right instead of left. (rewarded 1.58)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 2), heading: (0, -1), action: right, reward: 1.22464505716
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 13, 't': 7, 'action': 'right', 'reward': 1.224645057157869, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.22)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: left, reward: 1.30805410556
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 12, 't': 8, 'action': 'left', 'reward': 1.3080541055643597, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent drove left instead of right. (rewarded 1.31)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (7, 2), heading: (-1, 0), action: forward, reward: 1.36867028376
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 11, 't': 9, 'action': 'forward', 'reward': 1.3686702837598805, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent drove forward instead of right. (rewarded 1.37)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (7, 7), heading: (0, -1), action: right, reward: 1.71268506798
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 10, 't': 10, 'action': 'right', 'reward': 1.712685067975924, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent drove right instead of forward. (rewarded 1.71)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (7, 6), heading: (0, -1), action: forward, reward: 1.31566697645
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 9, 't': 11, 'action': 'forward', 'reward': 1.3156669764459126, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove forward instead of left. (rewarded 1.32)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: right, reward: 1.46824033154
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 8, 't': 12, 'action': 'right', 'reward': 1.4682403315428534, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent drove right instead of left. (rewarded 1.47)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (8, 7), heading: (0, 1), action: right, reward: -0.301716566703
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 7, 't': 13, 'action': 'right', 'reward': -0.301716566702792, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent drove right instead of forward. (rewarded -0.30)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: right, reward: 0.482443349591
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 6, 't': 14, 'action': 'right', 'reward': 0.4824433495911543, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent drove right instead of left. (rewarded 0.48)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: forward, reward: 1.41692031729
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 5, 't': 15, 'action': 'forward', 'reward': 1.416920317290379, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.42)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: forward, reward: -10.316301952
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 4, 't': 16, 'action': 'forward', 'reward': -10.316301951999497, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent attempted driving forward through a red light. (rewarded -10.32)
15% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (5, 7), heading: (-1, 0), action: forward, reward: 0.365007842906
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 3, 't': 17, 'action': 'forward', 'reward': 0.365007842906208, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 0.37)
10% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (5, 7), heading: (-1, 0), action: None, reward: -5.45862908431
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'left', 'left'), 'deadline': 2, 't': 18, 'action': None, 'reward': -5.458629084307947, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'left')
Agent idled at a green light with no oncoming traffic. (rewarded -5.46)
5% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 7), heading: (-1, 0), action: forward, reward: 1.5702875218
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', 'left'), 'deadline': 1, 't': 19, 'action': 'forward', 'reward': 1.5702875218034593, 'waypoint': 'forward'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('forward', 'green', 'right', 'left')
Agent followed the waypoint forward. (rewarded 1.57)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 23
\-------------------------
Environment.reset(): Trial set up with start = (5, 3), destination = (1, 5), deadline = 30
Simulating trial. . .
epsilon = 0.8025; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 4), heading: (0, 1), action: forward, reward: 0.0672296556065
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 30, 't': 0, 'action': 'forward', 'reward': 0.06722965560651817, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent drove forward instead of left. (rewarded 0.07)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 4), heading: (0, 1), action: None, reward: 1.81772105506
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 29, 't': 1, 'action': None, 'reward': 1.817721055057706, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.82)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 4), heading: (0, 1), action: left, reward: -40.6703451518
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 28, 't': 2, 'action': 'left', 'reward': -40.67034515177809, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -40.67)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 4), heading: (0, 1), action: forward, reward: -10.5314245988
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 27, 't': 3, 'action': 'forward', 'reward': -10.531424598759118, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -10.53)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 4), heading: (0, 1), action: forward, reward: -10.8158922206
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 26, 't': 4, 'action': 'forward', 'reward': -10.815892220644589, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent attempted driving forward through a red light. (rewarded -10.82)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 4), heading: (0, 1), action: None, reward: 1.16675378132
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 25, 't': 5, 'action': None, 'reward': 1.1667537813184572, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.17)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 4), heading: (-1, 0), action: right, reward: 1.14383783806
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 24, 't': 6, 'action': 'right', 'reward': 1.1438378380643353, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 1.14)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (4, 4), heading: (-1, 0), action: left, reward: -10.67084378
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 23, 't': 7, 'action': 'left', 'reward': -10.670843779971836, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -10.67)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (4, 4), heading: (-1, 0), action: left, reward: -9.31143600473
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 22, 't': 8, 'action': 'left', 'reward': -9.311436004731874, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -9.31)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (4, 4), heading: (-1, 0), action: None, reward: 2.8163062362
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 21, 't': 9, 'action': None, 'reward': 2.8163062361975446, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.82)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (4, 4), heading: (-1, 0), action: forward, reward: -10.9498894651
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 20, 't': 10, 'action': 'forward', 'reward': -10.949889465120854, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent attempted driving forward through a red light. (rewarded -10.95)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (4, 3), heading: (0, -1), action: right, reward: 0.998253674812
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 19, 't': 11, 'action': 'right', 'reward': 0.9982536748119938, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent drove right instead of forward. (rewarded 1.00)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (4, 2), heading: (0, -1), action: forward, reward: 0.863490406247
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 18, 't': 12, 'action': 'forward', 'reward': 0.8634904062471324, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent drove forward instead of left. (rewarded 0.86)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (4, 2), heading: (0, -1), action: None, reward: 2.14973840317
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 17, 't': 13, 'action': None, 'reward': 2.1497384031666904, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.15)
53% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (5, 2), heading: (1, 0), action: right, reward: 1.74311703955
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 16, 't': 14, 'action': 'right', 'reward': 1.7431170395481959, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent drove right instead of left. (rewarded 1.74)
50% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (6, 2), heading: (1, 0), action: forward, reward: 1.94006888441
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 15, 't': 15, 'action': 'forward', 'reward': 1.9400688844133358, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.94)
47% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (7, 2), heading: (1, 0), action: forward, reward: 2.41697389444
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 14, 't': 16, 'action': 'forward', 'reward': 2.416973894442998, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent followed the waypoint forward. (rewarded 2.42)
43% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (7, 3), heading: (0, 1), action: right, reward: 0.196618970467
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'right'), 'deadline': 13, 't': 17, 'action': 'right', 'reward': 0.1966189704670751, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'right')
Agent drove right instead of forward. (rewarded 0.20)
40% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: left, reward: 1.66161910672
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 12, 't': 18, 'action': 'left', 'reward': 1.6616191067174446, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 1.66)
37% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: left, reward: -9.40333125927
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 11, 't': 19, 'action': 'left', 'reward': -9.403331259267155, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent attempted driving left through a red light. (rewarded -9.40)
33% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: left, reward: -40.2521035326
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', 'left', 'forward'), 'deadline': 10, 't': 20, 'action': 'left', 'reward': -40.252103532595626, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'forward')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -40.25)
30% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (8, 4), heading: (0, 1), action: right, reward: 0.535660744299
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 9, 't': 21, 'action': 'right', 'reward': 0.5356607442994686, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent drove right instead of forward. (rewarded 0.54)
27% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (1, 4), heading: (1, 0), action: left, reward: 0.552500385274
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 8, 't': 22, 'action': 'left', 'reward': 0.5525003852743191, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 0.55)
23% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (2, 4), heading: (1, 0), action: forward, reward: 0.669142823039
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 7, 't': 23, 'action': 'forward', 'reward': 0.669142823039132, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent drove forward instead of right. (rewarded 0.67)
20% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (2, 5), heading: (0, 1), action: right, reward: 1.9151525757
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'right', None), 'deadline': 6, 't': 24, 'action': 'right', 'reward': 1.9151525756995547, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', None)
Agent followed the waypoint right. (rewarded 1.92)
17% of time remaining to reach destination.
/-------------------
| Step 25 Results
\-------------------
Environment.step(): t = 25
Environment.act() [POST]: location: (2, 5), heading: (0, 1), action: None, reward: 0.0556237710856
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 5, 't': 25, 'action': None, 'reward': 0.05562377108555738, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 0.06)
13% of time remaining to reach destination.
/-------------------
| Step 26 Results
\-------------------
Environment.step(): t = 26
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: right, reward: 0.863234150719
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'right', 'left'), 'deadline': 4, 't': 26, 'action': 'right', 'reward': 0.8632341507193115, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', 'left')
Agent followed the waypoint right. (rewarded 0.86)
10% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 24
\-------------------------
Environment.reset(): Trial set up with start = (7, 4), destination = (2, 6), deadline = 25
Simulating trial. . .
epsilon = 0.7945; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (7, 5), heading: (0, 1), action: forward, reward: 1.61549017547
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'right'), 'deadline': 25, 't': 0, 'action': 'forward', 'reward': 1.6154901754732545, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'right')
Agent drove forward instead of left. (rewarded 1.62)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 5), heading: (-1, 0), action: right, reward: 0.754997716605
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 24, 't': 1, 'action': 'right', 'reward': 0.7549977166047788, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent drove right instead of left. (rewarded 0.75)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 5), heading: (-1, 0), action: None, reward: 2.26434940811
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 23, 't': 2, 'action': None, 'reward': 2.2643494081135103, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.26)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (6, 4), heading: (0, -1), action: right, reward: 1.21656566503
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 22, 't': 3, 'action': 'right', 'reward': 1.2165656650259584, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 1.22)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 4), heading: (1, 0), action: right, reward: 2.81066371286
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 21, 't': 4, 'action': 'right', 'reward': 2.8106637128568632, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.81)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 4), heading: (1, 0), action: forward, reward: 2.55710147196
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 20, 't': 5, 'action': 'forward', 'reward': 2.5571014719634486, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.56)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (8, 5), heading: (0, 1), action: right, reward: 0.0179872305515
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 19, 't': 6, 'action': 'right', 'reward': 0.01798723055147111, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent drove right instead of forward. (rewarded 0.02)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (8, 5), heading: (0, 1), action: None, reward: -4.3653675693
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 18, 't': 7, 'action': None, 'reward': -4.365367569300334, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent idled at a green light with no oncoming traffic. (rewarded -4.37)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (8, 6), heading: (0, 1), action: forward, reward: 0.494747575536
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', 'left'), 'deadline': 17, 't': 8, 'action': 'forward', 'reward': 0.49474757553597704, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', 'left')
Agent drove forward instead of left. (rewarded 0.49)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (8, 6), heading: (0, 1), action: None, reward: -5.78640188417
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 16, 't': 9, 'action': None, 'reward': -5.7864018841673825, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -5.79)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (8, 6), heading: (0, 1), action: left, reward: -39.1389272697
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 15, 't': 10, 'action': 'left', 'reward': -39.13892726974368, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -39.14)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (7, 6), heading: (-1, 0), action: right, reward: -0.139169672008
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', 'left'), 'deadline': 14, 't': 11, 'action': 'right', 'reward': -0.13916967200757802, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'left')
Agent drove right instead of left. (rewarded -0.14)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (7, 5), heading: (0, -1), action: right, reward: 2.73404510575
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 13, 't': 12, 'action': 'right', 'reward': 2.734045105745688, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.73)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (7, 5), heading: (0, -1), action: left, reward: -10.7232272006
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 12, 't': 13, 'action': 'left', 'reward': -10.723227200603972, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent attempted driving left through a red light. (rewarded -10.72)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (7, 5), heading: (0, -1), action: forward, reward: -40.7036731002
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 4, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 11, 't': 14, 'action': 'forward', 'reward': -40.703673100167165, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -40.70)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (7, 5), heading: (0, -1), action: None, reward: 0.867714047128
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', 'forward'), 'deadline': 10, 't': 15, 'action': None, 'reward': 0.8677140471278384, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 0.87)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (8, 5), heading: (1, 0), action: right, reward: 0.895870866011
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 9, 't': 16, 'action': 'right', 'reward': 0.8958708660107477, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent followed the waypoint right. (rewarded 0.90)
32% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (8, 6), heading: (0, 1), action: right, reward: 0.525524991101
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'left'), 'deadline': 8, 't': 17, 'action': 'right', 'reward': 0.5255249911010025, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'left')
Agent drove right instead of forward. (rewarded 0.53)
28% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (8, 6), heading: (0, 1), action: None, reward: 2.09711737348
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 7, 't': 18, 'action': None, 'reward': 2.097117373482631, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.10)
24% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (8, 7), heading: (0, 1), action: forward, reward: 0.275481305116
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 6, 't': 19, 'action': 'forward', 'reward': 0.2754813051162809, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove forward instead of left. (rewarded 0.28)
20% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: right, reward: 0.378932901755
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 5, 't': 20, 'action': 'right', 'reward': 0.3789329017547318, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove right instead of left. (rewarded 0.38)
16% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: left, reward: -20.0469078596
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'left'}, 'violation': 3, 'light': 'green', 'state': ('right', 'green', 'forward', 'left'), 'deadline': 4, 't': 21, 'action': 'left', 'reward': -20.046907859568257, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', 'left')
Agent attempted driving left through traffic and cause a minor accident. (rewarded -20.05)
12% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: left, reward: -39.298797148
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 4, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 3, 't': 22, 'action': 'left', 'reward': -39.29879714799204, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -39.30)
8% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: left, reward: -39.7824469565
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 4, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 2, 't': 23, 'action': 'left', 'reward': -39.782446956488855, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -39.78)
4% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: left, reward: -10.724166813
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 1, 't': 24, 'action': 'left', 'reward': -10.724166812963926, 'waypoint': 'right'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('right', 'red', None, 'left')
Agent attempted driving left through a red light. (rewarded -10.72)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 25
\-------------------------
Environment.reset(): Trial set up with start = (6, 6), destination = (8, 4), deadline = 20
Simulating trial. . .
epsilon = 0.7866; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 6), heading: (0, -1), action: forward, reward: -39.7384752798
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('right', 'red', 'forward', 'forward'), 'deadline': 20, 't': 0, 'action': 'forward', 'reward': -39.73847527984307, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', 'forward')
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -39.74)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (7, 6), heading: (1, 0), action: right, reward: 1.43517758654
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 1.4351775865401741, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent followed the waypoint right. (rewarded 1.44)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 6), heading: (1, 0), action: left, reward: -19.401781635
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': 'left'}, 'violation': 3, 'light': 'green', 'state': ('forward', 'green', 'right', 'left'), 'deadline': 18, 't': 2, 'action': 'left', 'reward': -19.40178163500214, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', 'left')
Agent attempted driving left through traffic and cause a minor accident. (rewarded -19.40)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 5), heading: (0, -1), action: left, reward: 0.792706850268
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 17, 't': 3, 'action': 'left', 'reward': 0.7927068502677354, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'right')
Agent drove left instead of forward. (rewarded 0.79)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 5), heading: (1, 0), action: right, reward: 2.80292563715
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', 'left'), 'deadline': 16, 't': 4, 'action': 'right', 'reward': 2.802925637147001, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', 'left')
Agent followed the waypoint right. (rewarded 2.80)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 5), heading: (1, 0), action: right, reward: -19.6697542508
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 3, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 15, 't': 5, 'action': 'right', 'reward': -19.669754250838718, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent attempted driving right through traffic and cause a minor accident. (rewarded -19.67)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (8, 5), heading: (1, 0), action: None, reward: 2.4513008179
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 14, 't': 6, 'action': None, 'reward': 2.451300817895997, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.45)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (8, 5), heading: (1, 0), action: None, reward: 2.60463613276
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 13, 't': 7, 'action': None, 'reward': 2.6046361327604437, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.60)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (8, 5), heading: (1, 0), action: left, reward: -10.0978961612
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'left', 'left'), 'deadline': 12, 't': 8, 'action': 'left', 'reward': -10.097896161208407, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'left')
Agent attempted driving left through a red light. (rewarded -10.10)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (8, 5), heading: (1, 0), action: None, reward: -4.28571065505
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': 'left'}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', 'left', 'left'), 'deadline': 11, 't': 9, 'action': None, 'reward': -4.285710655050162, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'left')
Agent idled at a green light with no oncoming traffic. (rewarded -4.29)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (8, 5), heading: (1, 0), action: None, reward: -4.63091877737
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 10, 't': 10, 'action': None, 'reward': -4.630918777365774, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent idled at a green light with no oncoming traffic. (rewarded -4.63)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (1, 5), heading: (1, 0), action: forward, reward: 1.67622659503
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 9, 't': 11, 'action': 'forward', 'reward': 1.6762265950271458, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent drove forward instead of left. (rewarded 1.68)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (1, 5), heading: (1, 0), action: None, reward: 1.88053892853
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'right'), 'deadline': 8, 't': 12, 'action': None, 'reward': 1.8805389285318823, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 1.88)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (1, 6), heading: (0, 1), action: right, reward: 1.03495243047
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 7, 't': 13, 'action': 'right', 'reward': 1.0349524304701552, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent drove right instead of left. (rewarded 1.03)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (8, 6), heading: (-1, 0), action: right, reward: 2.10276250154
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 6, 't': 14, 'action': 'right', 'reward': 2.102762501543191, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.10)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (8, 6), heading: (-1, 0), action: forward, reward: -10.3512137768
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 5, 't': 15, 'action': 'forward', 'reward': -10.351213776827834, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent attempted driving forward through a red light. (rewarded -10.35)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (7, 6), heading: (-1, 0), action: forward, reward: 0.0830458526097
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 4, 't': 16, 'action': 'forward', 'reward': 0.08304585260968556, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent drove forward instead of right. (rewarded 0.08)
15% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (7, 5), heading: (0, -1), action: right, reward: 1.0003487817
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 3, 't': 17, 'action': 'right', 'reward': 1.0003487817009011, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent followed the waypoint right. (rewarded 1.00)
10% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (6, 5), heading: (-1, 0), action: left, reward: 0.342056667345
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 2, 't': 18, 'action': 'left', 'reward': 0.34205666734489526, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent drove left instead of right. (rewarded 0.34)
5% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (6, 4), heading: (0, -1), action: right, reward: 0.871161002174
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 1, 't': 19, 'action': 'right', 'reward': 0.8711610021736298, 'waypoint': 'right'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 0.87)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 26
\-------------------------
Environment.reset(): Trial set up with start = (3, 5), destination = (6, 4), deadline = 20
Simulating trial. . .
epsilon = 0.7788; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: left, reward: 0.564620763879
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', 'forward'), 'deadline': 20, 't': 0, 'action': 'left', 'reward': 0.5646207638790617, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', 'forward')
Agent drove left instead of right. (rewarded 0.56)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 4), heading: (0, -1), action: right, reward: 0.467349367462
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 0.46734936746189504, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent drove right instead of forward. (rewarded 0.47)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 4), heading: (0, -1), action: left, reward: -10.8195861105
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 18, 't': 2, 'action': 'left', 'reward': -10.819586110519179, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -10.82)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 4), heading: (0, -1), action: forward, reward: -10.4615409499
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': -10.461540949850942, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -10.46)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: right, reward: 1.42693858098
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 16, 't': 4, 'action': 'right', 'reward': 1.4269385809782702, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent drove right instead of left. (rewarded 1.43)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: None, reward: 2.00833380126
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 15, 't': 5, 'action': None, 'reward': 2.008333801262212, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.01)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 5), heading: (0, 1), action: right, reward: 0.765753741731
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 14, 't': 6, 'action': 'right', 'reward': 0.7657537417313816, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent drove right instead of forward. (rewarded 0.77)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 5), heading: (0, 1), action: None, reward: -4.70938295936
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 13, 't': 7, 'action': None, 'reward': -4.709382959364141, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.71)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 6), heading: (0, 1), action: forward, reward: 0.604895333137
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 12, 't': 8, 'action': 'forward', 'reward': 0.6048953331371637, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove forward instead of left. (rewarded 0.60)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (3, 6), heading: (0, 1), action: forward, reward: -10.3566569543
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 11, 't': 9, 'action': 'forward', 'reward': -10.356656954335921, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -10.36)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (3, 6), heading: (0, 1), action: left, reward: -9.42505235233
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 10, 't': 10, 'action': 'left', 'reward': -9.425052352325473, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -9.43)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (4, 6), heading: (1, 0), action: left, reward: 1.7976359226
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 9, 't': 11, 'action': 'left', 'reward': 1.7976359226040617, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.80)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (4, 6), heading: (1, 0), action: None, reward: 1.64295555338
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 8, 't': 12, 'action': None, 'reward': 1.6429555533848592, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.64)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: forward, reward: 2.31738543355
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'left'), 'deadline': 7, 't': 13, 'action': 'forward', 'reward': 2.31738543354544, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'left')
Agent followed the waypoint forward. (rewarded 2.32)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (5, 7), heading: (0, 1), action: right, reward: 0.969352930643
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 6, 't': 14, 'action': 'right', 'reward': 0.9693529306432219, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent drove right instead of forward. (rewarded 0.97)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (5, 2), heading: (0, 1), action: forward, reward: 1.30452880403
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 5, 't': 15, 'action': 'forward', 'reward': 1.3045288040296268, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove forward instead of left. (rewarded 1.30)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (6, 2), heading: (1, 0), action: left, reward: 2.15328031508
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 4, 't': 16, 'action': 'left', 'reward': 2.1532803150778124, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.15)
15% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (6, 2), heading: (1, 0), action: left, reward: -39.2450806862
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 4, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 3, 't': 17, 'action': 'left', 'reward': -39.24508068622416, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -39.25)
10% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (6, 2), heading: (1, 0), action: forward, reward: -9.20152410492
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 2, 't': 18, 'action': 'forward', 'reward': -9.201524104920082, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent attempted driving forward through a red light. (rewarded -9.20)
5% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (6, 3), heading: (0, 1), action: right, reward: 1.57421007456
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 1, 't': 19, 'action': 'right', 'reward': 1.5742100745585055, 'waypoint': 'right'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.57)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 27
\-------------------------
Environment.reset(): Trial set up with start = (6, 7), destination = (3, 5), deadline = 25
Simulating trial. . .
epsilon = 0.7711; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 6), heading: (0, -1), action: right, reward: 1.81990225275
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 1.8199022527532795, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent drove right instead of forward. (rewarded 1.82)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 6), heading: (0, -1), action: forward, reward: -9.13586979568
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 24, 't': 1, 'action': 'forward', 'reward': -9.13586979568383, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent attempted driving forward through a red light. (rewarded -9.14)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 6), heading: (0, -1), action: None, reward: 1.67535337411
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.6753533741126858, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.68)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 6), heading: (1, 0), action: right, reward: 0.285265287958
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 22, 't': 3, 'action': 'right', 'reward': 0.28526528795754325, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent drove right instead of left. (rewarded 0.29)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 6), heading: (1, 0), action: None, reward: -5.63573914403
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 21, 't': 4, 'action': None, 'reward': -5.635739144030848, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent idled at a green light with no oncoming traffic. (rewarded -5.64)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: forward, reward: 0.935081132615
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 20, 't': 5, 'action': 'forward', 'reward': 0.9350811326149597, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 0.94)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: None, reward: -5.71045678805
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 19, 't': 6, 'action': None, 'reward': -5.710456788046244, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent idled at a green light with no oncoming traffic. (rewarded -5.71)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: forward, reward: -10.5618818078
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': -10.561881807777139, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent attempted driving forward through a red light. (rewarded -10.56)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: left, reward: -10.8310024158
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 17, 't': 8, 'action': 'left', 'reward': -10.83100241582784, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent attempted driving left through a red light. (rewarded -10.83)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: left, reward: -9.64082667994
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': 'right'}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'left', 'right'), 'deadline': 16, 't': 9, 'action': 'left', 'reward': -9.640826679944864, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'right')
Agent attempted driving left through a red light. (rewarded -9.64)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: forward, reward: 2.290354503
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 15, 't': 10, 'action': 'forward', 'reward': 2.290354503003547, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent followed the waypoint forward. (rewarded 2.29)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: None, reward: 1.51821704726
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 14, 't': 11, 'action': None, 'reward': 1.5182170472638141, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.52)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: None, reward: 1.37102835899
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 13, 't': 12, 'action': None, 'reward': 1.3710283589948982, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.37)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: right, reward: 0.676030443969
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', 'right'), 'deadline': 12, 't': 13, 'action': 'right', 'reward': 0.676030443969377, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', 'right')
Agent drove right instead of forward. (rewarded 0.68)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: None, reward: -5.25718093161
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 11, 't': 14, 'action': None, 'reward': -5.257180931613963, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.26)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: None, reward: 1.73896204039
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 10, 't': 15, 'action': None, 'reward': 1.738962040385133, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.74)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: None, reward: 2.31490533274
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 9, 't': 16, 'action': None, 'reward': 2.314905332737541, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.31)
32% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: right, reward: 1.12180382211
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 8, 't': 17, 'action': 'right', 'reward': 1.1218038221086832, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent drove right instead of left. (rewarded 1.12)
28% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (8, 2), heading: (0, 1), action: left, reward: 1.16375250594
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 7, 't': 18, 'action': 'left', 'reward': 1.16375250594107, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent drove left instead of right. (rewarded 1.16)
24% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (8, 2), heading: (0, 1), action: left, reward: -10.8125337798
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 6, 't': 19, 'action': 'left', 'reward': -10.812533779819283, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -10.81)
20% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (8, 2), heading: (0, 1), action: None, reward: 0.592963584477
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 5, 't': 20, 'action': None, 'reward': 0.5929635844766041, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.59)
16% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (8, 2), heading: (0, 1), action: left, reward: -10.1897308596
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 4, 't': 21, 'action': 'left', 'reward': -10.189730859615288, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -10.19)
12% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (8, 3), heading: (0, 1), action: forward, reward: -0.436359832012
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 3, 't': 22, 'action': 'forward', 'reward': -0.43635983201171724, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove forward instead of left. (rewarded -0.44)
8% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (8, 3), heading: (0, 1), action: None, reward: 0.356172409038
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 2, 't': 23, 'action': None, 'reward': 0.356172409038247, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.36)
4% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (8, 3), heading: (0, 1), action: None, reward: 1.07837311751
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 1, 't': 24, 'action': None, 'reward': 1.0783731175068263, 'waypoint': 'left'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('left', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.08)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 28
\-------------------------
Environment.reset(): Trial set up with start = (1, 3), destination = (4, 4), deadline = 20
Simulating trial. . .
epsilon = 0.7634; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: right, reward: 2.94677393999
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 2.9467739399891184, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.95)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 4), heading: (0, 1), action: right, reward: 1.12430528721
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'left'), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 1.1243052872087338, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'left')
Agent drove right instead of forward. (rewarded 1.12)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 4), heading: (0, 1), action: forward, reward: -10.1084845823
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': 'forward', 'reward': -10.108484582283326, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent attempted driving forward through a red light. (rewarded -10.11)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 4), heading: (0, 1), action: None, reward: 1.89136602142
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 1.8913660214167003, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.89)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: left, reward: 1.80079014848
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 16, 't': 4, 'action': 'left', 'reward': 1.8007901484826552, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.80)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (3, 5), heading: (0, 1), action: right, reward: 0.866134168608
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 15, 't': 5, 'action': 'right', 'reward': 0.8661341686077164, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent drove right instead of forward. (rewarded 0.87)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 5), heading: (0, 1), action: forward, reward: -9.14069323974
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'right', 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'forward', 'left'), 'deadline': 14, 't': 6, 'action': 'forward', 'reward': -9.140693239744305, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'left')
Agent attempted driving forward through a red light. (rewarded -9.14)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: right, reward: 0.411699105069
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', 'forward'), 'deadline': 13, 't': 7, 'action': 'right', 'reward': 0.411699105068972, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', 'forward')
Agent drove right instead of left. (rewarded 0.41)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: left, reward: 0.307865355913
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 12, 't': 8, 'action': 'left', 'reward': 0.3078653559126213, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent drove left instead of right. (rewarded 0.31)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: forward, reward: -39.2681410144
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 11, 't': 9, 'action': 'forward', 'reward': -39.268141014437255, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -39.27)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: forward, reward: -39.4874368453
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 10, 't': 10, 'action': 'forward', 'reward': -39.4874368453235, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -39.49)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: left, reward: -39.7321636621
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 9, 't': 11, 'action': 'left', 'reward': -39.732163662059264, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -39.73)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: right, reward: 1.10391749321
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'forward', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', 'right'), 'deadline': 8, 't': 12, 'action': 'right', 'reward': 1.1039174932134057, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', 'right')
Agent drove right instead of left. (rewarded 1.10)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (8, 6), heading: (-1, 0), action: forward, reward: 1.27861037638
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 7, 't': 13, 'action': 'forward', 'reward': 1.2786103763810197, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent drove forward instead of right. (rewarded 1.28)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (8, 5), heading: (0, -1), action: right, reward: 1.40059494415
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 6, 't': 14, 'action': 'right', 'reward': 1.4005949441543464, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.40)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (1, 5), heading: (1, 0), action: right, reward: 1.51722482302
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 5, 't': 15, 'action': 'right', 'reward': 1.5172248230195031, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.52)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (1, 5), heading: (1, 0), action: None, reward: 1.97162630417
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 4, 't': 16, 'action': None, 'reward': 1.9716263041675544, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.97)
15% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (1, 6), heading: (0, 1), action: right, reward: 0.467085645141
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 3, 't': 17, 'action': 'right', 'reward': 0.46708564514093587, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent drove right instead of forward. (rewarded 0.47)
10% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: left, reward: 0.420221744392
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 2, 't': 18, 'action': 'left', 'reward': 0.4202217443918703, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 0.42)
5% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (2, 7), heading: (0, 1), action: right, reward: 0.241612933683
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 1, 't': 19, 'action': 'right', 'reward': 0.24161293368338865, 'waypoint': 'forward'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('forward', 'green', None, 'forward')
Agent drove right instead of forward. (rewarded 0.24)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 29
\-------------------------
Environment.reset(): Trial set up with start = (4, 7), destination = (1, 5), deadline = 25
Simulating trial. . .
epsilon = 0.7558; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 7), heading: (1, 0), action: left, reward: 1.39765443527
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', 'left'), 'deadline': 25, 't': 0, 'action': 'left', 'reward': 1.3976544352719709, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', 'left')
Agent drove left instead of right. (rewarded 1.40)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 7), heading: (1, 0), action: forward, reward: 1.11619723602
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'left'), 'deadline': 24, 't': 1, 'action': 'forward', 'reward': 1.116197236016944, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'left')
Agent followed the waypoint forward. (rewarded 1.12)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 7), heading: (1, 0), action: forward, reward: -9.01168574207
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 23, 't': 2, 'action': 'forward', 'reward': -9.01168574206705, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent attempted driving forward through a red light. (rewarded -9.01)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (6, 2), heading: (0, 1), action: right, reward: 1.18792637871
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 22, 't': 3, 'action': 'right', 'reward': 1.1879263787099064, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent drove right instead of forward. (rewarded 1.19)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 2), heading: (-1, 0), action: right, reward: 0.133861318243
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 21, 't': 4, 'action': 'right', 'reward': 0.13386131824285574, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent drove right instead of left. (rewarded 0.13)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 2), heading: (-1, 0), action: None, reward: 0.549207917743
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'forward'), 'deadline': 20, 't': 5, 'action': None, 'reward': 0.5492079177431647, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 0.55)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (5, 7), heading: (0, -1), action: right, reward: 1.83192416772
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 19, 't': 6, 'action': 'right', 'reward': 1.8319241677218456, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.83)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (5, 7), heading: (0, -1), action: None, reward: -4.76925891549
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', 'forward', 'forward'), 'deadline': 18, 't': 7, 'action': None, 'reward': -4.769258915493865, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -4.77)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (6, 7), heading: (1, 0), action: right, reward: 2.71193724247
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 17, 't': 8, 'action': 'right', 'reward': 2.711937242465348, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 2.71)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (6, 7), heading: (1, 0), action: right, reward: -19.4420116509
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 3, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 16, 't': 9, 'action': 'right', 'reward': -19.4420116509038, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent attempted driving right through traffic and cause a minor accident. (rewarded -19.44)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (6, 7), heading: (1, 0), action: forward, reward: -10.1587360795
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 15, 't': 10, 'action': 'forward', 'reward': -10.158736079485205, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent attempted driving forward through a red light. (rewarded -10.16)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (6, 7), heading: (1, 0), action: None, reward: -5.82610700465
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 14, 't': 11, 'action': None, 'reward': -5.826107004651351, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.83)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (6, 7), heading: (1, 0), action: None, reward: -4.10860893947
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 13, 't': 12, 'action': None, 'reward': -4.108608939474145, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'right')
Agent idled at a green light with no oncoming traffic. (rewarded -4.11)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (6, 2), heading: (0, 1), action: right, reward: -0.0782679023473
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 12, 't': 13, 'action': 'right', 'reward': -0.07826790234733183, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent drove right instead of forward. (rewarded -0.08)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (7, 2), heading: (1, 0), action: left, reward: 0.73483626893
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 11, 't': 14, 'action': 'left', 'reward': 0.7348362689295138, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 0.73)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (7, 2), heading: (1, 0), action: None, reward: -5.80679888667
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 10, 't': 15, 'action': None, 'reward': -5.806798886666943, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.81)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (8, 2), heading: (1, 0), action: forward, reward: 2.20811569796
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 9, 't': 16, 'action': 'forward', 'reward': 2.2081156979582266, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.21)
32% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (8, 2), heading: (1, 0), action: forward, reward: -40.9282167582
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 8, 't': 17, 'action': 'forward', 'reward': -40.92821675822916, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -40.93)
28% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (8, 3), heading: (0, 1), action: right, reward: -0.0813733775134
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 7, 't': 18, 'action': 'right', 'reward': -0.08137337751341556, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent drove right instead of forward. (rewarded -0.08)
24% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (8, 4), heading: (0, 1), action: forward, reward: 1.39304487684
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 6, 't': 19, 'action': 'forward', 'reward': 1.3930448768370631, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent drove forward instead of left. (rewarded 1.39)
20% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (8, 4), heading: (0, 1), action: None, reward: 1.41343653029
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'right'), 'deadline': 5, 't': 20, 'action': None, 'reward': 1.4134365302882064, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 1.41)
16% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (8, 4), heading: (0, 1), action: left, reward: -10.4596709204
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 4, 't': 21, 'action': 'left', 'reward': -10.459670920431966, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent attempted driving left through a red light. (rewarded -10.46)
12% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (7, 4), heading: (-1, 0), action: right, reward: 0.931394195228
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 3, 't': 22, 'action': 'right', 'reward': 0.9313941952279283, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent drove right instead of left. (rewarded 0.93)
8% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (7, 5), heading: (0, 1), action: left, reward: 1.301567937
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 2, 't': 23, 'action': 'left', 'reward': 1.301567936999273, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.30)
4% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (7, 5), heading: (0, 1), action: left, reward: -20.5543253763
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 3, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 1, 't': 24, 'action': 'left', 'reward': -20.554325376298355, 'waypoint': 'left'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('left', 'green', 'forward', None)
Agent attempted driving left through traffic and cause a minor accident. (rewarded -20.55)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 30
\-------------------------
Environment.reset(): Trial set up with start = (5, 7), destination = (1, 4), deadline = 35
Simulating trial. . .
epsilon = 0.7483; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 2), heading: (0, 1), action: left, reward: 2.10215730263
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'right'), 'deadline': 35, 't': 0, 'action': 'left', 'reward': 2.1021573026267935, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'right')
Agent followed the waypoint left. (rewarded 2.10)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 2), heading: (0, 1), action: left, reward: -10.1098822328
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'right'}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'forward', 'right'), 'deadline': 34, 't': 1, 'action': 'left', 'reward': -10.109882232773016, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'right')
Agent attempted driving left through a red light. (rewarded -10.11)
94% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 2), heading: (0, 1), action: None, reward: 1.06083383898
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 33, 't': 2, 'action': None, 'reward': 1.0608338389838687, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.06)
91% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 2), heading: (0, 1), action: left, reward: -10.1928176501
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 32, 't': 3, 'action': 'left', 'reward': -10.192817650070642, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent attempted driving left through a red light. (rewarded -10.19)
89% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (4, 2), heading: (-1, 0), action: right, reward: 0.410172689809
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 31, 't': 4, 'action': 'right', 'reward': 0.41017268980948485, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 0.41)
86% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (4, 7), heading: (0, -1), action: right, reward: 1.07794318316
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 30, 't': 5, 'action': 'right', 'reward': 1.0779431831569088, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove right instead of forward. (rewarded 1.08)
83% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 7), heading: (0, -1), action: None, reward: 2.54895999183
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 29, 't': 6, 'action': None, 'reward': 2.5489599918323274, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.55)
80% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (4, 7), heading: (0, -1), action: None, reward: -4.70407559768
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 28, 't': 7, 'action': None, 'reward': -4.704075597676111, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.70)
77% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (5, 7), heading: (1, 0), action: right, reward: 1.36556104062
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 27, 't': 8, 'action': 'right', 'reward': 1.3655610406185343, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove right instead of left. (rewarded 1.37)
74% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (5, 7), heading: (1, 0), action: None, reward: -4.0641921171
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 26, 't': 9, 'action': None, 'reward': -4.064192117097113, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -4.06)
71% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (5, 6), heading: (0, -1), action: left, reward: 0.506414890242
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 25, 't': 10, 'action': 'left', 'reward': 0.5064148902415472, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent drove left instead of forward. (rewarded 0.51)
69% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (6, 6), heading: (1, 0), action: right, reward: 0.908788053792
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 24, 't': 11, 'action': 'right', 'reward': 0.9087880537917077, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent followed the waypoint right. (rewarded 0.91)
66% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (6, 6), heading: (1, 0), action: None, reward: 2.47076746722
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 23, 't': 12, 'action': None, 'reward': 2.4707674672186304, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.47)
63% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (6, 5), heading: (0, -1), action: left, reward: 0.923138674343
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 22, 't': 13, 'action': 'left', 'reward': 0.9231386743425657, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove left instead of forward. (rewarded 0.92)
60% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (6, 5), heading: (0, -1), action: left, reward: -40.7909096826
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': 'left'}, 'violation': 4, 'light': 'red', 'state': ('right', 'red', 'forward', 'left'), 'deadline': 21, 't': 14, 'action': 'left', 'reward': -40.790909682555345, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', 'left')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -40.79)
57% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (7, 5), heading: (1, 0), action: right, reward: 1.47709164784
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 20, 't': 15, 'action': 'right', 'reward': 1.4770916478445497, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent followed the waypoint right. (rewarded 1.48)
54% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (8, 5), heading: (1, 0), action: forward, reward: 1.30940108366
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 19, 't': 16, 'action': 'forward', 'reward': 1.30940108366053, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.31)
51% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (1, 5), heading: (1, 0), action: forward, reward: 2.72191337063
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'forward'), 'deadline': 18, 't': 17, 'action': 'forward', 'reward': 2.7219133706262193, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'forward')
Agent followed the waypoint forward. (rewarded 2.72)
49% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (1, 5), heading: (1, 0), action: left, reward: -10.1784684175
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 17, 't': 18, 'action': 'left', 'reward': -10.178468417499964, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent attempted driving left through a red light. (rewarded -10.18)
46% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 4), heading: (0, -1), action: left, reward: 2.24823993432
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 16, 't': 19, 'action': 'left', 'reward': 2.2482399343248156, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.25)
43% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 31
\-------------------------
Environment.reset(): Trial set up with start = (3, 6), destination = (5, 3), deadline = 25
Simulating trial. . .
epsilon = 0.7408; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: None, reward: 2.23440627261
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 25, 't': 0, 'action': None, 'reward': 2.2344062726124756, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.23)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: right, reward: -20.150348195
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 3, 'light': 'red', 'state': ('forward', 'red', 'left', 'forward'), 'deadline': 24, 't': 1, 'action': 'right', 'reward': -20.150348195031647, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'forward')
Agent attempted driving right through traffic and cause a minor accident. (rewarded -20.15)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: None, reward: 1.08120190525
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.0812019052546535, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.08)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 7), heading: (0, 1), action: right, reward: 1.88073640013
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 22, 't': 3, 'action': 'right', 'reward': 1.8807364001257347, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent drove right instead of forward. (rewarded 1.88)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 7), heading: (0, 1), action: forward, reward: -10.8804834498
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': -10.880483449758227, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -10.88)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 7), heading: (-1, 0), action: right, reward: 1.46611562522
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 20, 't': 5, 'action': 'right', 'reward': 1.4661156252248513, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove right instead of left. (rewarded 1.47)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 7), heading: (-1, 0), action: forward, reward: -9.34225415134
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 19, 't': 6, 'action': 'forward', 'reward': -9.342254151341896, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent attempted driving forward through a red light. (rewarded -9.34)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 6), heading: (0, -1), action: right, reward: -0.039772094409
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 18, 't': 7, 'action': 'right', 'reward': -0.039772094409014525, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded -0.04)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: left, reward: 0.739286630137
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', 'left'), 'deadline': 17, 't': 8, 'action': 'left', 'reward': 0.739286630137166, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', 'left')
Agent drove left instead of right. (rewarded 0.74)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (1, 5), heading: (0, -1), action: right, reward: 0.985216776082
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'left'), 'deadline': 16, 't': 9, 'action': 'right', 'reward': 0.9852167760822224, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'left')
Agent drove right instead of forward. (rewarded 0.99)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (1, 5), heading: (0, -1), action: None, reward: 1.10913099692
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 15, 't': 10, 'action': None, 'reward': 1.1091309969220944, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.11)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: right, reward: 1.53467702548
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 14, 't': 11, 'action': 'right', 'reward': 1.5346770254843838, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 1.53)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: forward, reward: -9.2233829935
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 13, 't': 12, 'action': 'forward', 'reward': -9.223382993504709, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -9.22)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: right, reward: 0.201420194325
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 12, 't': 13, 'action': 'right', 'reward': 0.20142019432511704, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove right instead of forward. (rewarded 0.20)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: None, reward: -4.59670026156
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'left', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', 'right', None), 'deadline': 11, 't': 14, 'action': None, 'reward': -4.596700261563953, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.60)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: left, reward: 2.50188257328
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 10, 't': 15, 'action': 'left', 'reward': 2.5018825732789107, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.50)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (4, 6), heading: (1, 0), action: forward, reward: 0.875169160914
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 9, 't': 16, 'action': 'forward', 'reward': 0.8751691609136989, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'right')
Agent followed the waypoint forward. (rewarded 0.88)
32% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (4, 6), heading: (1, 0), action: None, reward: 1.92212339042
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'forward'), 'deadline': 8, 't': 17, 'action': None, 'reward': 1.9221233904237147, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 1.92)
28% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (4, 6), heading: (1, 0), action: forward, reward: -40.5165159119
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', 'forward', 'forward'), 'deadline': 7, 't': 18, 'action': 'forward', 'reward': -40.51651591193165, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'forward')
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -40.52)
24% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: forward, reward: 0.724815970111
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 6, 't': 19, 'action': 'forward', 'reward': 0.7248159701108039, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 0.72)
20% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: None, reward: 0.910408100797
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 5, 't': 20, 'action': None, 'reward': 0.910408100797244, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 0.91)
16% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (5, 5), heading: (0, -1), action: left, reward: -0.274396888896
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 4, 't': 21, 'action': 'left', 'reward': -0.2743968888962015, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent drove left instead of right. (rewarded -0.27)
12% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (5, 5), heading: (0, -1), action: right, reward: -20.8256437015
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 3, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 3, 't': 22, 'action': 'right', 'reward': -20.825643701453025, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent attempted driving right through traffic and cause a minor accident. (rewarded -20.83)
8% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (6, 5), heading: (1, 0), action: right, reward: -0.755535593547
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 2, 't': 23, 'action': 'right', 'reward': -0.7555355935469681, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove right instead of forward. (rewarded -0.76)
4% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (7, 5), heading: (1, 0), action: forward, reward: 0.366187910949
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 1, 't': 24, 'action': 'forward', 'reward': 0.3661879109491073, 'waypoint': 'left'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('left', 'green', None, None)
Agent drove forward instead of left. (rewarded 0.37)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 32
\-------------------------
Environment.reset(): Trial set up with start = (1, 6), destination = (6, 5), deadline = 20
Simulating trial. . .
epsilon = 0.7334; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: None, reward: -5.09002392765
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'left', 'left': 'left'}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'right', 'left'), 'deadline': 20, 't': 0, 'action': None, 'reward': -5.090023927648284, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', 'left')
Agent idled at a green light with no oncoming traffic. (rewarded -5.09)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: None, reward: -5.36947458161
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'left', 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'right', 'forward'), 'deadline': 19, 't': 1, 'action': None, 'reward': -5.369474581609444, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -5.37)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 6), heading: (-1, 0), action: forward, reward: 1.50427653935
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 18, 't': 2, 'action': 'forward', 'reward': 1.5042765393509487, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.50)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 6), heading: (-1, 0), action: left, reward: -9.23822429569
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 17, 't': 3, 'action': 'left', 'reward': -9.238224295693662, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent attempted driving left through a red light. (rewarded -9.24)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 6), heading: (-1, 0), action: forward, reward: 2.90950609672
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 2.909506096722935, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.91)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 5), heading: (0, -1), action: right, reward: 1.90866861097
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', 'left'), 'deadline': 15, 't': 5, 'action': 'right', 'reward': 1.9086686109708317, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', 'left')
Agent drove right instead of forward. (rewarded 1.91)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (7, 5), heading: (0, -1), action: left, reward: -20.6578698366
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 3, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 14, 't': 6, 'action': 'left', 'reward': -20.657869836604032, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent attempted driving left through traffic and cause a minor accident. (rewarded -20.66)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 5), heading: (-1, 0), action: left, reward: 1.82817807998
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 13, 't': 7, 'action': 'left', 'reward': 1.8281780799814225, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.83)
60% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 33
\-------------------------
Environment.reset(): Trial set up with start = (3, 4), destination = (7, 3), deadline = 25
Simulating trial. . .
epsilon = 0.7261; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 4), heading: (-1, 0), action: None, reward: 2.98830371367
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'left'), 'deadline': 25, 't': 0, 'action': None, 'reward': 2.9883037136748722, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'left')
Agent properly idled at a red light. (rewarded 2.99)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 4), heading: (-1, 0), action: left, reward: -39.0698432218
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': 'left'}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', 'forward', 'left'), 'deadline': 24, 't': 1, 'action': 'left', 'reward': -39.069843221763094, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'left')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -39.07)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 4), heading: (-1, 0), action: None, reward: 2.55605174167
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 23, 't': 2, 'action': None, 'reward': 2.5560517416722712, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.56)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 4), heading: (-1, 0), action: None, reward: 1.36698992078
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 22, 't': 3, 'action': None, 'reward': 1.3669899207783982, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.37)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 4), heading: (-1, 0), action: None, reward: 1.51244647077
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 21, 't': 4, 'action': None, 'reward': 1.5124464707656862, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.51)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (3, 4), heading: (-1, 0), action: None, reward: -4.96708259366
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 20, 't': 5, 'action': None, 'reward': -4.967082593660278, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.97)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 4), heading: (-1, 0), action: forward, reward: 1.26494223113
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 19, 't': 6, 'action': 'forward', 'reward': 1.264942231128171, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.26)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: forward, reward: 1.40265958781
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 1.402659587813455, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.40)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (1, 5), heading: (0, 1), action: left, reward: 1.01081058982
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 17, 't': 8, 'action': 'left', 'reward': 1.0108105898159763, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent drove left instead of forward. (rewarded 1.01)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: right, reward: 1.29118234773
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 16, 't': 9, 'action': 'right', 'reward': 1.2911823477333617, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent followed the waypoint right. (rewarded 1.29)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: None, reward: 2.4190642937
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 15, 't': 10, 'action': None, 'reward': 2.4190642936995723, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.42)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: left, reward: -10.2469045026
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'right'}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, 'right'), 'deadline': 14, 't': 11, 'action': 'left', 'reward': -10.246904502644673, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'right')
Agent attempted driving left through a red light. (rewarded -10.25)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: None, reward: 2.25923336656
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 13, 't': 12, 'action': None, 'reward': 2.259233366557489, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.26)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (8, 6), heading: (0, 1), action: left, reward: -0.194224342936
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 12, 't': 13, 'action': 'left', 'reward': -0.19422434293633772, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove left instead of forward. (rewarded -0.19)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (7, 6), heading: (-1, 0), action: right, reward: 0.815266596344
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 11, 't': 14, 'action': 'right', 'reward': 0.8152665963444745, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent followed the waypoint right. (rewarded 0.82)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (7, 6), heading: (-1, 0), action: None, reward: 2.21227842248
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 10, 't': 15, 'action': None, 'reward': 2.212278422483946, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.21)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (6, 6), heading: (-1, 0), action: forward, reward: 1.08431481912
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'forward'), 'deadline': 9, 't': 16, 'action': 'forward', 'reward': 1.0843148191160856, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'forward')
Agent drove forward instead of left. (rewarded 1.08)
32% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (6, 6), heading: (-1, 0), action: left, reward: -40.5060249618
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': 'left'}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', 'right', 'left'), 'deadline': 8, 't': 17, 'action': 'left', 'reward': -40.50602496178615, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', 'left')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -40.51)
28% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (6, 6), heading: (-1, 0), action: None, reward: 2.4691227702
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', 'left'), 'deadline': 7, 't': 18, 'action': None, 'reward': 2.4691227702014755, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', 'left')
Agent properly idled at a red light. (rewarded 2.47)
24% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (6, 5), heading: (0, -1), action: right, reward: 0.494775850545
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 6, 't': 19, 'action': 'right', 'reward': 0.49477585054539974, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent drove right instead of left. (rewarded 0.49)
20% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (7, 5), heading: (1, 0), action: right, reward: 1.01214824975
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 5, 't': 20, 'action': 'right', 'reward': 1.0121482497508234, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.01)
16% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (7, 5), heading: (1, 0), action: None, reward: 1.38663499152
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 4, 't': 21, 'action': None, 'reward': 1.3866349915241767, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.39)
12% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (7, 6), heading: (0, 1), action: right, reward: 1.18858760161
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 3, 't': 22, 'action': 'right', 'reward': 1.1885876016131518, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent drove right instead of left. (rewarded 1.19)
8% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (6, 6), heading: (-1, 0), action: right, reward: 0.391549756049
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', 'forward'), 'deadline': 2, 't': 23, 'action': 'right', 'reward': 0.3915497560490009, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', 'forward')
Agent drove right instead of forward. (rewarded 0.39)
4% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (6, 6), heading: (-1, 0), action: forward, reward: -10.4717873115
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 1, 't': 24, 'action': 'forward', 'reward': -10.471787311484883, 'waypoint': 'left'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -10.47)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 34
\-------------------------
Environment.reset(): Trial set up with start = (8, 5), destination = (3, 3), deadline = 25
Simulating trial. . .
epsilon = 0.7189; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 5), heading: (1, 0), action: forward, reward: -40.0968704185
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': 'left'}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 25, 't': 0, 'action': 'forward', 'reward': -40.09687041848107, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -40.10)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 5), heading: (1, 0), action: None, reward: 2.7121430852
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 24, 't': 1, 'action': None, 'reward': 2.7121430851958683, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.71)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 6), heading: (0, 1), action: right, reward: 0.322690455865
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 23, 't': 2, 'action': 'right', 'reward': 0.3226904558650667, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent drove right instead of forward. (rewarded 0.32)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 6), heading: (0, 1), action: None, reward: -5.47163609358
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 22, 't': 3, 'action': None, 'reward': -5.4716360935841415, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.47)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 6), heading: (-1, 0), action: right, reward: 0.0761696843131
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 21, 't': 4, 'action': 'right', 'reward': 0.07616968431310556, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent drove right instead of left. (rewarded 0.08)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 6), heading: (-1, 0), action: right, reward: -20.9963347917
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 3, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 20, 't': 5, 'action': 'right', 'reward': -20.99633479165683, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent attempted driving right through traffic and cause a minor accident. (rewarded -21.00)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (6, 6), heading: (-1, 0), action: forward, reward: 1.05324352472
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 19, 't': 6, 'action': 'forward', 'reward': 1.0532435247225345, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent drove forward instead of left. (rewarded 1.05)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 5), heading: (0, -1), action: right, reward: 1.61514449798
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 18, 't': 7, 'action': 'right', 'reward': 1.6151444979821408, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent drove right instead of forward. (rewarded 1.62)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (6, 5), heading: (0, -1), action: left, reward: -20.7601677919
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 3, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 17, 't': 8, 'action': 'left', 'reward': -20.76016779192085, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent attempted driving left through traffic and cause a minor accident. (rewarded -20.76)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (6, 5), heading: (0, -1), action: forward, reward: -10.4444233865
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 16, 't': 9, 'action': 'forward', 'reward': -10.444423386478125, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -10.44)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (6, 5), heading: (0, -1), action: forward, reward: -10.0339926132
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 15, 't': 10, 'action': 'forward', 'reward': -10.033992613199239, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -10.03)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (5, 5), heading: (-1, 0), action: left, reward: 1.80606323613
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 14, 't': 11, 'action': 'left', 'reward': 1.8060632361259747, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 1.81)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (5, 4), heading: (0, -1), action: right, reward: 0.468107780755
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 13, 't': 12, 'action': 'right', 'reward': 0.4681077807547662, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent drove right instead of forward. (rewarded 0.47)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (6, 4), heading: (1, 0), action: right, reward: 0.968009538185
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 12, 't': 13, 'action': 'right', 'reward': 0.9680095381850817, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent drove right instead of left. (rewarded 0.97)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (6, 4), heading: (1, 0), action: left, reward: -10.6232362283
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 11, 't': 14, 'action': 'left', 'reward': -10.623236228252424, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -10.62)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (6, 3), heading: (0, -1), action: left, reward: 1.56307692151
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'right'), 'deadline': 10, 't': 15, 'action': 'left', 'reward': 1.5630769215146714, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'right')
Agent followed the waypoint left. (rewarded 1.56)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (6, 3), heading: (0, -1), action: None, reward: -4.80436178996
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 9, 't': 16, 'action': None, 'reward': -4.804361789961892, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -4.80)
32% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (6, 3), heading: (0, -1), action: forward, reward: -39.6144609469
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', 'right', 'forward'), 'deadline': 8, 't': 17, 'action': 'forward', 'reward': -39.61446094685902, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', 'forward')
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -39.61)
28% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (6, 3), heading: (0, -1), action: left, reward: -10.199859746
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 7, 't': 18, 'action': 'left', 'reward': -10.19985974601159, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -10.20)
24% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (6, 3), heading: (0, -1), action: None, reward: -5.72313530274
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 6, 't': 19, 'action': None, 'reward': -5.72313530273972, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.72)
20% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (6, 3), heading: (0, -1), action: None, reward: -4.228278236
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 5, 't': 20, 'action': None, 'reward': -4.228278236002592, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.23)
16% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (7, 3), heading: (1, 0), action: right, reward: 1.29332638227
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 4, 't': 21, 'action': 'right', 'reward': 1.2933263822680021, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent drove right instead of left. (rewarded 1.29)
12% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (7, 3), heading: (1, 0), action: left, reward: -40.0658243158
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': 'right'}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', 'right', 'right'), 'deadline': 3, 't': 22, 'action': 'left', 'reward': -40.06582431583241, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', 'right')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -40.07)
8% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (7, 4), heading: (0, 1), action: right, reward: 0.760163770188
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 2, 't': 23, 'action': 'right', 'reward': 0.7601637701877528, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'right')
Agent drove right instead of forward. (rewarded 0.76)
4% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (7, 4), heading: (0, 1), action: left, reward: -10.8885200044
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 1, 't': 24, 'action': 'left', 'reward': -10.888520004435748, 'waypoint': 'left'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('left', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -10.89)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 35
\-------------------------
Environment.reset(): Trial set up with start = (1, 3), destination = (6, 6), deadline = 30
Simulating trial. . .
epsilon = 0.7118; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 3), heading: (0, -1), action: None, reward: 1.72415967568
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 30, 't': 0, 'action': None, 'reward': 1.7241596756847206, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.72)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 3), heading: (0, -1), action: left, reward: -9.99224590703
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 29, 't': 1, 'action': 'left', 'reward': -9.992245907033206, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -9.99)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: right, reward: 0.539561832287
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 28, 't': 2, 'action': 'right', 'reward': 0.5395618322872014, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent drove right instead of left. (rewarded 0.54)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 2.37899351542
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 27, 't': 3, 'action': None, 'reward': 2.3789935154219712, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.38)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 1.85383875807
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 26, 't': 4, 'action': None, 'reward': 1.8538387580694697, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.85)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 2.56198663676
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'forward'), 'deadline': 25, 't': 5, 'action': None, 'reward': 2.5619866367566666, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 2.56)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: -5.66779631586
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 24, 't': 6, 'action': None, 'reward': -5.667796315855575, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.67)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 4), heading: (0, 1), action: right, reward: 0.447562731614
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'right'), 'deadline': 23, 't': 7, 'action': 'right', 'reward': 0.4475627316137002, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'right')
Agent drove right instead of left. (rewarded 0.45)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: right, reward: 2.56037070109
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'right', None), 'deadline': 22, 't': 8, 'action': 'right', 'reward': 2.5603707010860868, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', None)
Agent followed the waypoint right. (rewarded 2.56)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: None, reward: 2.78990844369
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 21, 't': 9, 'action': None, 'reward': 2.7899084436900337, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.79)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: None, reward: 2.52882288045
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 20, 't': 10, 'action': None, 'reward': 2.528822880452803, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.53)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (1, 3), heading: (0, -1), action: right, reward: 0.761154598869
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 19, 't': 11, 'action': 'right', 'reward': 0.7611545988689384, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent drove right instead of forward. (rewarded 0.76)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: right, reward: 1.27545216201
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 18, 't': 12, 'action': 'right', 'reward': 1.2754521620088943, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 1.28)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: forward, reward: -9.59980242627
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 17, 't': 13, 'action': 'forward', 'reward': -9.59980242627371, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent attempted driving forward through a red light. (rewarded -9.60)
53% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: forward, reward: -9.8012926339
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'left', 'right'), 'deadline': 16, 't': 14, 'action': 'forward', 'reward': -9.801292633898159, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'right')
Agent attempted driving forward through a red light. (rewarded -9.80)
50% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (2, 4), heading: (0, 1), action: right, reward: 1.72508741006
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 15, 't': 15, 'action': 'right', 'reward': 1.725087410059026, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent drove right instead of left. (rewarded 1.73)
47% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: right, reward: 2.19834782646
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', 'left'), 'deadline': 14, 't': 16, 'action': 'right', 'reward': 2.198347826459878, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', 'left')
Agent followed the waypoint right. (rewarded 2.20)
43% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: None, reward: 0.939191418335
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 13, 't': 17, 'action': None, 'reward': 0.939191418335017, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.94)
40% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: forward, reward: -40.9088438666
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 12, 't': 18, 'action': 'forward', 'reward': -40.90884386664623, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -40.91)
37% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (1, 5), heading: (0, 1), action: left, reward: 0.732646393942
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 11, 't': 19, 'action': 'left', 'reward': 0.7326463939419324, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove left instead of forward. (rewarded 0.73)
33% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: right, reward: 0.898224241495
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'right', 'left'), 'deadline': 10, 't': 20, 'action': 'right', 'reward': 0.8982242414946178, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', 'left')
Agent followed the waypoint right. (rewarded 0.90)
30% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (7, 5), heading: (-1, 0), action: forward, reward: 1.91036147914
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 9, 't': 21, 'action': 'forward', 'reward': 1.910361479138242, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.91)
27% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (7, 5), heading: (-1, 0), action: forward, reward: -9.07493837537
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 8, 't': 22, 'action': 'forward', 'reward': -9.074938375370952, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent attempted driving forward through a red light. (rewarded -9.07)
23% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (7, 5), heading: (-1, 0), action: None, reward: 2.22975209543
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 7, 't': 23, 'action': None, 'reward': 2.2297520954303467, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.23)
20% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (7, 5), heading: (-1, 0), action: None, reward: 1.99303248461
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 6, 't': 24, 'action': None, 'reward': 1.993032484612784, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.99)
17% of time remaining to reach destination.
/-------------------
| Step 25 Results
\-------------------
Environment.step(): t = 25
Environment.act() [POST]: location: (7, 4), heading: (0, -1), action: right, reward: 0.573714216038
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 5, 't': 25, 'action': 'right', 'reward': 0.5737142160377886, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent drove right instead of forward. (rewarded 0.57)
13% of time remaining to reach destination.
/-------------------
| Step 26 Results
\-------------------
Environment.step(): t = 26
Environment.act() [POST]: location: (7, 3), heading: (0, -1), action: forward, reward: 1.07507755812
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 4, 't': 26, 'action': 'forward', 'reward': 1.0750775581208412, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove forward instead of left. (rewarded 1.08)
10% of time remaining to reach destination.
/-------------------
| Step 27 Results
\-------------------
Environment.step(): t = 27
Environment.act() [POST]: location: (7, 3), heading: (0, -1), action: forward, reward: -39.7558586994
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 3, 't': 27, 'action': 'forward', 'reward': -39.755858699422824, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -39.76)
7% of time remaining to reach destination.
/-------------------
| Step 28 Results
\-------------------
Environment.step(): t = 28
Environment.act() [POST]: location: (7, 3), heading: (0, -1), action: forward, reward: -10.6580612938
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': 'right'}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, 'right'), 'deadline': 2, 't': 28, 'action': 'forward', 'reward': -10.658061293827112, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'right')
Agent attempted driving forward through a red light. (rewarded -10.66)
3% of time remaining to reach destination.
/-------------------
| Step 29 Results
\-------------------
Environment.step(): t = 29
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: right, reward: -0.838232091605
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', None), 'deadline': 1, 't': 29, 'action': 'right', 'reward': -0.8382320916048058, 'waypoint': 'left'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('left', 'green', 'right', None)
Agent drove right instead of left. (rewarded -0.84)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 36
\-------------------------
Environment.reset(): Trial set up with start = (1, 5), destination = (6, 7), deadline = 25
Simulating trial. . .
epsilon = 0.7047; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 5), heading: (1, 0), action: right, reward: -19.3416456127
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': 'forward'}, 'violation': 3, 'light': 'red', 'state': ('right', 'red', 'right', 'forward'), 'deadline': 25, 't': 0, 'action': 'right', 'reward': -19.341645612684275, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', 'forward')
Agent attempted driving right through traffic and cause a minor accident. (rewarded -19.34)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 5), heading: (1, 0), action: forward, reward: -40.2717448889
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 4, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 24, 't': 1, 'action': 'forward', 'reward': -40.271744888917404, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -40.27)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 6), heading: (0, 1), action: right, reward: 1.50745644719
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 23, 't': 2, 'action': 'right', 'reward': 1.507456447191706, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.51)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: left, reward: 0.904681569145
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 22, 't': 3, 'action': 'left', 'reward': 0.9046815691450192, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent drove left instead of right. (rewarded 0.90)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: left, reward: -20.9882068998
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'left'}, 'violation': 3, 'light': 'green', 'state': ('right', 'green', 'forward', 'left'), 'deadline': 21, 't': 4, 'action': 'left', 'reward': -20.98820689976919, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', 'left')
Agent attempted driving left through traffic and cause a minor accident. (rewarded -20.99)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: left, reward: -9.8750903346
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 20, 't': 5, 'action': 'left', 'reward': -9.875090334602625, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent attempted driving left through a red light. (rewarded -9.88)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 7), heading: (0, 1), action: right, reward: 2.52573601193
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 19, 't': 6, 'action': 'right', 'reward': 2.5257360119335823, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.53)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 7), heading: (1, 0), action: left, reward: -0.0734276997754
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 18, 't': 7, 'action': 'left', 'reward': -0.07342769977536845, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent drove left instead of right. (rewarded -0.07)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 7), heading: (1, 0), action: None, reward: 2.15077525594
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 17, 't': 8, 'action': None, 'reward': 2.1507752559377886, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 2.15)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (3, 7), heading: (1, 0), action: None, reward: -5.78007940641
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 16, 't': 9, 'action': None, 'reward': -5.780079406411891, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.78)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (4, 7), heading: (1, 0), action: forward, reward: 1.42287030526
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 15, 't': 10, 'action': 'forward', 'reward': 1.422870305257428, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.42)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (4, 7), heading: (1, 0), action: None, reward: 2.29099085757
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 14, 't': 11, 'action': None, 'reward': 2.2909908575728797, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.29)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (4, 7), heading: (1, 0), action: left, reward: -10.7119565071
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 13, 't': 12, 'action': 'left', 'reward': -10.711956507123276, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -10.71)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (5, 7), heading: (1, 0), action: forward, reward: 2.28077329828
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 12, 't': 13, 'action': 'forward', 'reward': 2.2807732982792577, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.28)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (5, 7), heading: (1, 0), action: None, reward: 1.51468528905
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 11, 't': 14, 'action': None, 'reward': 1.5146852890494875, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.51)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (5, 7), heading: (1, 0), action: None, reward: 1.47872859501
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 10, 't': 15, 'action': None, 'reward': 1.4787285950105817, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 1.48)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (5, 7), heading: (1, 0), action: left, reward: -9.08306141851
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 9, 't': 16, 'action': 'left', 'reward': -9.083061418505984, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent attempted driving left through a red light. (rewarded -9.08)
32% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 7), heading: (1, 0), action: forward, reward: 0.841294232981
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'forward'), 'deadline': 8, 't': 17, 'action': 'forward', 'reward': 0.8412942329806463, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'forward')
Agent followed the waypoint forward. (rewarded 0.84)
28% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 37
\-------------------------
Environment.reset(): Trial set up with start = (6, 5), destination = (2, 4), deadline = 25
Simulating trial. . .
epsilon = 0.6977; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 5), heading: (0, -1), action: left, reward: -40.1035523539
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 4, 'light': 'red', 'state': ('right', 'red', 'right', None), 'deadline': 25, 't': 0, 'action': 'left', 'reward': -40.10355235392594, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', None)
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -40.10)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (7, 5), heading: (1, 0), action: right, reward: 2.51603953446
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 24, 't': 1, 'action': 'right', 'reward': 2.5160395344582343, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.52)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 6), heading: (0, 1), action: right, reward: 1.37535756364
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 23, 't': 2, 'action': 'right', 'reward': 1.3753575636415383, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent drove right instead of forward. (rewarded 1.38)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 7), heading: (0, 1), action: forward, reward: 0.14647633132
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': 0.14647633132047622, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent drove forward instead of left. (rewarded 0.15)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: left, reward: 1.42325802088
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 21, 't': 4, 'action': 'left', 'reward': 1.4232580208797907, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.42)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 2), heading: (0, 1), action: right, reward: 1.54750125448
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 20, 't': 5, 'action': 'right', 'reward': 1.5475012544809221, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent drove right instead of forward. (rewarded 1.55)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (7, 2), heading: (-1, 0), action: right, reward: 1.68744092472
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 19, 't': 6, 'action': 'right', 'reward': 1.6874409247186741, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent drove right instead of left. (rewarded 1.69)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (7, 2), heading: (-1, 0), action: None, reward: -5.17791382489
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 18, 't': 7, 'action': None, 'reward': -5.1779138248858345, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.18)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (7, 3), heading: (0, 1), action: left, reward: 1.52570814412
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 17, 't': 8, 'action': 'left', 'reward': 1.5257081441243747, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.53)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: left, reward: 2.72038889244
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 16, 't': 9, 'action': 'left', 'reward': 2.7203888924392334, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 2.72)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: None, reward: 2.14580868416
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 15, 't': 10, 'action': None, 'reward': 2.145808684161887, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.15)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: forward, reward: -40.071957148
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'forward', 'left': None}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 14, 't': 11, 'action': 'forward', 'reward': -40.07195714795724, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -40.07)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: None, reward: 2.23664271496
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 13, 't': 12, 'action': None, 'reward': 2.236642714961968, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 2.24)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: forward, reward: 2.66610638533
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 12, 't': 13, 'action': 'forward', 'reward': 2.6661063853275087, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.67)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (1, 2), heading: (0, -1), action: left, reward: 1.38886860713
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 11, 't': 14, 'action': 'left', 'reward': 1.3888686071290084, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent drove left instead of forward. (rewarded 1.39)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (1, 2), heading: (0, -1), action: forward, reward: -40.6601251858
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('right', 'red', 'left', 'forward'), 'deadline': 10, 't': 15, 'action': 'forward', 'reward': -40.66012518584467, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', 'forward')
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -40.66)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (2, 2), heading: (1, 0), action: right, reward: 1.11165646138
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 9, 't': 16, 'action': 'right', 'reward': 1.1116564613822952, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 1.11)
32% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (2, 3), heading: (0, 1), action: right, reward: 2.30110025137
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 8, 't': 17, 'action': 'right', 'reward': 2.3011002513724415, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.30)
28% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (2, 3), heading: (0, 1), action: left, reward: -10.8457658103
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 7, 't': 18, 'action': 'left', 'reward': -10.845765810324457, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -10.85)
24% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (2, 3), heading: (0, 1), action: left, reward: -40.3367441148
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 6, 't': 19, 'action': 'left', 'reward': -40.336744114800005, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -40.34)
20% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (2, 3), heading: (0, 1), action: forward, reward: -40.898710871
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', 'forward', 'forward'), 'deadline': 5, 't': 20, 'action': 'forward', 'reward': -40.898710870999096, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'forward')
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -40.90)
16% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (2, 3), heading: (0, 1), action: left, reward: -20.2788869213
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'left'}, 'violation': 3, 'light': 'green', 'state': ('forward', 'green', 'forward', 'left'), 'deadline': 4, 't': 21, 'action': 'left', 'reward': -20.278886921277632, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'left')
Agent attempted driving left through traffic and cause a minor accident. (rewarded -20.28)
12% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: left, reward: 1.19885902161
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 3, 't': 22, 'action': 'left', 'reward': 1.198859021607507, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent drove left instead of forward. (rewarded 1.20)
8% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: forward, reward: -10.3735694789
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 2, 't': 23, 'action': 'forward', 'reward': -10.373569478873021, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -10.37)
4% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: left, reward: -9.39624662915
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 1, 't': 24, 'action': 'left', 'reward': -9.396246629154518, 'waypoint': 'right'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('right', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -9.40)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 38
\-------------------------
Environment.reset(): Trial set up with start = (8, 6), destination = (5, 7), deadline = 20
Simulating trial. . .
epsilon = 0.6907; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: left, reward: 0.11957906559
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', 'left'), 'deadline': 20, 't': 0, 'action': 'left', 'reward': 0.11957906559019693, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', 'left')
Agent drove left instead of right. (rewarded 0.12)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: right, reward: 1.93018742053
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 1.9301874205291778, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.93)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: left, reward: -9.06330111096
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 18, 't': 2, 'action': 'left', 'reward': -9.063301110964892, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -9.06)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: None, reward: -5.44632269893
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 17, 't': 3, 'action': None, 'reward': -5.446322698931654, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.45)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: right, reward: 2.90257623258
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 16, 't': 4, 'action': 'right', 'reward': 2.902576232581537, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.90)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 2), heading: (0, 1), action: left, reward: 1.7809090247
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 15, 't': 5, 'action': 'left', 'reward': 1.7809090246972017, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove left instead of forward. (rewarded 1.78)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: left, reward: 1.04424263667
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', 'forward'), 'deadline': 14, 't': 6, 'action': 'left', 'reward': 1.0442426366670745, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', 'forward')
Agent drove left instead of right. (rewarded 1.04)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 7), heading: (0, -1), action: left, reward: 1.2725878737
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 13, 't': 7, 'action': 'left', 'reward': 1.272587873697483, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.27)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (1, 7), heading: (0, -1), action: None, reward: -5.63091867892
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 12, 't': 8, 'action': None, 'reward': -5.630918678924768, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.63)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (1, 7), heading: (0, -1), action: forward, reward: -39.7885384324
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 11, 't': 9, 'action': 'forward', 'reward': -39.788538432411976, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -39.79)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (1, 7), heading: (0, -1), action: None, reward: 2.61227709947
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 10, 't': 10, 'action': None, 'reward': 2.6122770994749267, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.61)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (1, 6), heading: (0, -1), action: forward, reward: 0.946891112423
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 9, 't': 11, 'action': 'forward', 'reward': 0.9468911124232793, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent drove forward instead of left. (rewarded 0.95)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (1, 5), heading: (0, -1), action: forward, reward: 0.802752694872
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 8, 't': 12, 'action': 'forward', 'reward': 0.8027526948716412, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove forward instead of left. (rewarded 0.80)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: right, reward: 0.873714215549
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 7, 't': 13, 'action': 'right', 'reward': 0.8737142155490053, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 0.87)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: forward, reward: 1.7512082254
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 6, 't': 14, 'action': 'forward', 'reward': 1.7512082253953574, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.75)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: None, reward: 2.3675988549
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 5, 't': 15, 'action': None, 'reward': 2.3675988549040623, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.37)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: forward, reward: 1.22808787645
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 4, 't': 16, 'action': 'forward', 'reward': 1.2280878764519207, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.23)
15% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (5, 5), heading: (1, 0), action: forward, reward: 0.444844576519
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 3, 't': 17, 'action': 'forward', 'reward': 0.444844576519146, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent followed the waypoint forward. (rewarded 0.44)
10% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (5, 5), heading: (1, 0), action: forward, reward: -9.51766402339
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'right', 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', 'right', 'left'), 'deadline': 2, 't': 18, 'action': 'forward', 'reward': -9.517664023393118, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', 'left')
Agent attempted driving forward through a red light. (rewarded -9.52)
5% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (5, 5), heading: (1, 0), action: None, reward: -5.04265866891
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 1, 't': 19, 'action': None, 'reward': -5.0426586689142585, 'waypoint': 'right'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('right', 'green', None, 'left')
Agent idled at a green light with no oncoming traffic. (rewarded -5.04)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 39
\-------------------------
Environment.reset(): Trial set up with start = (2, 7), destination = (4, 4), deadline = 25
Simulating trial. . .
epsilon = 0.6839; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 6), heading: (0, -1), action: right, reward: 0.084717292789
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 0.08471729278902784, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 0.08)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: right, reward: 2.64520923054
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', 'right'), 'deadline': 24, 't': 1, 'action': 'right', 'reward': 2.645209230538722, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', 'right')
Agent followed the waypoint right. (rewarded 2.65)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: forward, reward: -9.67971507681
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 23, 't': 2, 'action': 'forward', 'reward': -9.6797150768086, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent attempted driving forward through a red light. (rewarded -9.68)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: None, reward: -4.2249843316
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 22, 't': 3, 'action': None, 'reward': -4.224984331598995, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.22)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 7), heading: (0, 1), action: right, reward: 1.3747616688
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 21, 't': 4, 'action': 'right', 'reward': 1.3747616687989903, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove right instead of forward. (rewarded 1.37)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (4, 7), heading: (1, 0), action: left, reward: 1.97705315088
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 20, 't': 5, 'action': 'left', 'reward': 1.9770531508805, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 1.98)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (5, 7), heading: (1, 0), action: forward, reward: 0.843779472569
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 19, 't': 6, 'action': 'forward', 'reward': 0.8437794725692799, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent drove forward instead of right. (rewarded 0.84)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (5, 2), heading: (0, 1), action: right, reward: 1.23094215724
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'right'), 'deadline': 18, 't': 7, 'action': 'right', 'reward': 1.230942157239813, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'right')
Agent followed the waypoint right. (rewarded 1.23)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (4, 2), heading: (-1, 0), action: right, reward: 2.14589954333
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 17, 't': 8, 'action': 'right', 'reward': 2.145899543326919, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.15)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (4, 7), heading: (0, -1), action: right, reward: 0.0481627372879
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'forward', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', 'right'), 'deadline': 16, 't': 9, 'action': 'right', 'reward': 0.048162737287882784, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', 'right')
Agent drove right instead of left. (rewarded 0.05)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (5, 7), heading: (1, 0), action: right, reward: 1.4946021844
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 15, 't': 10, 'action': 'right', 'reward': 1.4946021843972574, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.49)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (5, 6), heading: (0, -1), action: left, reward: 0.369305821842
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 14, 't': 11, 'action': 'left', 'reward': 0.3693058218422821, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent drove left instead of right. (rewarded 0.37)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (5, 6), heading: (0, -1), action: forward, reward: -10.9300388276
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 13, 't': 12, 'action': 'forward', 'reward': -10.930038827606568, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent attempted driving forward through a red light. (rewarded -10.93)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (6, 6), heading: (1, 0), action: right, reward: 0.886895844337
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 12, 't': 13, 'action': 'right', 'reward': 0.8868958443372568, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent drove right instead of left. (rewarded 0.89)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (7, 6), heading: (1, 0), action: forward, reward: 1.48684016765
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 11, 't': 14, 'action': 'forward', 'reward': 1.4868401676544307, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent drove forward instead of left. (rewarded 1.49)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (7, 5), heading: (0, -1), action: left, reward: 2.37382550486
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 10, 't': 15, 'action': 'left', 'reward': 2.373825504855041, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 2.37)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (7, 5), heading: (0, -1), action: None, reward: -4.29739689878
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 9, 't': 16, 'action': None, 'reward': -4.297396898780971, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.30)
32% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (8, 5), heading: (1, 0), action: right, reward: 0.588599547187
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 8, 't': 17, 'action': 'right', 'reward': 0.58859954718733, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove right instead of left. (rewarded 0.59)
28% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (8, 6), heading: (0, 1), action: right, reward: 0.614023370886
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 7, 't': 18, 'action': 'right', 'reward': 0.6140233708859684, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent drove right instead of forward. (rewarded 0.61)
24% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: left, reward: 0.744403814133
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 6, 't': 19, 'action': 'left', 'reward': 0.7444038141329403, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 0.74)
20% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: None, reward: -5.87690084254
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 5, 't': 20, 'action': None, 'reward': -5.876900842542584, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.88)
16% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: right, reward: -0.13066054476
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 4, 't': 21, 'action': 'right', 'reward': -0.1306605447603323, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove right instead of forward. (rewarded -0.13)
12% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (2, 7), heading: (1, 0), action: left, reward: 1.63975371146
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 3, 't': 22, 'action': 'left', 'reward': 1.6397537114615823, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.64)
8% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (2, 7), heading: (1, 0), action: left, reward: -10.8898853302
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'forward', 'left'), 'deadline': 2, 't': 23, 'action': 'left', 'reward': -10.889885330220164, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'left')
Agent attempted driving left through a red light. (rewarded -10.89)
4% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (2, 7), heading: (1, 0), action: None, reward: 1.28448880246
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 1, 't': 24, 'action': None, 'reward': 1.2844888024639387, 'waypoint': 'forward'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.28)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 40
\-------------------------
Environment.reset(): Trial set up with start = (3, 6), destination = (5, 3), deadline = 25
Simulating trial. . .
epsilon = 0.6771; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (4, 6), heading: (1, 0), action: right, reward: 2.7617390434
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', 'right'), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 2.76173904340218, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', 'right')
Agent followed the waypoint right. (rewarded 2.76)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (4, 6), heading: (1, 0), action: None, reward: -5.50626938115
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 24, 't': 1, 'action': None, 'reward': -5.5062693811519186, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -5.51)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (4, 6), heading: (1, 0), action: None, reward: -5.19889057515
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 23, 't': 2, 'action': None, 'reward': -5.198890575146612, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -5.20)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (4, 6), heading: (1, 0), action: None, reward: -4.28330642089
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 22, 't': 3, 'action': None, 'reward': -4.283306420887428, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -4.28)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (4, 6), heading: (1, 0), action: None, reward: 1.69350546426
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 21, 't': 4, 'action': None, 'reward': 1.6935054642583234, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.69)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (4, 7), heading: (0, 1), action: right, reward: 1.35349685045
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 20, 't': 5, 'action': 'right', 'reward': 1.3534968504539724, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent drove right instead of forward. (rewarded 1.35)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 7), heading: (0, 1), action: left, reward: -10.4423360204
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, 'right'), 'deadline': 19, 't': 6, 'action': 'left', 'reward': -10.442336020354684, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'right')
Agent attempted driving left through a red light. (rewarded -10.44)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (5, 7), heading: (1, 0), action: left, reward: 1.22432110454
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 18, 't': 7, 'action': 'left', 'reward': 1.2243211045368323, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.22)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (5, 7), heading: (1, 0), action: None, reward: 0.803633257421
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 17, 't': 8, 'action': None, 'reward': 0.8036332574209638, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.80)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (5, 7), heading: (1, 0), action: None, reward: -4.99028729444
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 16, 't': 9, 'action': None, 'reward': -4.99028729444174, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.99)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (5, 7), heading: (1, 0), action: None, reward: -5.87147660275
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 15, 't': 10, 'action': None, 'reward': -5.871476602751004, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.87)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (5, 7), heading: (1, 0), action: None, reward: -4.91288314214
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 14, 't': 11, 'action': None, 'reward': -4.912883142138408, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.91)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (5, 2), heading: (0, 1), action: right, reward: 2.45023394196
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 13, 't': 12, 'action': 'right', 'reward': 2.450233941964451, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.45)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (5, 2), heading: (0, 1), action: forward, reward: -10.6299970584
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 12, 't': 13, 'action': 'forward', 'reward': -10.629997058364916, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent attempted driving forward through a red light. (rewarded -10.63)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (5, 2), heading: (0, 1), action: left, reward: -10.5492554797
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 11, 't': 14, 'action': 'left', 'reward': -10.549255479680648, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -10.55)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (5, 2), heading: (0, 1), action: None, reward: 1.72389951193
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 10, 't': 15, 'action': None, 'reward': 1.7238995119304474, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.72)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (6, 2), heading: (1, 0), action: left, reward: 1.41361007679
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 9, 't': 16, 'action': 'left', 'reward': 1.413610076789602, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove left instead of forward. (rewarded 1.41)
32% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (7, 2), heading: (1, 0), action: forward, reward: 0.932831370441
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 8, 't': 17, 'action': 'forward', 'reward': 0.9328313704407494, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent drove forward instead of right. (rewarded 0.93)
28% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (7, 3), heading: (0, 1), action: right, reward: 1.96991680215
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 7, 't': 18, 'action': 'right', 'reward': 1.9699168021468987, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.97)
24% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (7, 3), heading: (0, 1), action: left, reward: -19.7512035108
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 3, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 6, 't': 19, 'action': 'left', 'reward': -19.7512035108168, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent attempted driving left through traffic and cause a minor accident. (rewarded -19.75)
20% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (7, 3), heading: (0, 1), action: left, reward: -19.9122895106
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 3, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 5, 't': 20, 'action': 'left', 'reward': -19.912289510618987, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent attempted driving left through traffic and cause a minor accident. (rewarded -19.91)
16% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (7, 3), heading: (0, 1), action: left, reward: -10.2989450483
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 4, 't': 21, 'action': 'left', 'reward': -10.298945048252257, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent attempted driving left through a red light. (rewarded -10.30)
12% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (7, 3), heading: (0, 1), action: left, reward: -10.8344002241
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'right'}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', 'forward', 'right'), 'deadline': 3, 't': 22, 'action': 'left', 'reward': -10.834400224057692, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', 'right')
Agent attempted driving left through a red light. (rewarded -10.83)
8% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (6, 3), heading: (-1, 0), action: right, reward: 1.21122092536
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'right', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', 'right'), 'deadline': 2, 't': 23, 'action': 'right', 'reward': 1.2112209253560504, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', 'right')
Agent followed the waypoint right. (rewarded 1.21)
4% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 3), heading: (-1, 0), action: forward, reward: 1.18467631276
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 1, 't': 24, 'action': 'forward', 'reward': 1.1846763127644684, 'waypoint': 'forward'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.18)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 41
\-------------------------
Environment.reset(): Trial set up with start = (6, 4), destination = (7, 7), deadline = 20
Simulating trial. . .
epsilon = 0.6703; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 4), heading: (-1, 0), action: None, reward: -5.03933194232
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 20, 't': 0, 'action': None, 'reward': -5.039331942320892, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.04)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 3), heading: (0, -1), action: right, reward: 1.12891982884
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 1.128919828839472, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.13)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 3), heading: (-1, 0), action: left, reward: 1.89179598675
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 18, 't': 2, 'action': 'left', 'reward': 1.8917959867499787, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent drove left instead of right. (rewarded 1.89)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 3), heading: (-1, 0), action: None, reward: -4.13165972767
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 17, 't': 3, 'action': None, 'reward': -4.131659727670227, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.13)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (4, 3), heading: (-1, 0), action: forward, reward: 1.7149979337
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 1.7149979337046348, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent drove forward instead of right. (rewarded 1.71)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (4, 3), heading: (-1, 0), action: forward, reward: -10.3985204125
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': -10.398520412451063, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent attempted driving forward through a red light. (rewarded -10.40)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 2), heading: (0, -1), action: right, reward: 2.63533443535
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 14, 't': 6, 'action': 'right', 'reward': 2.6353344353499057, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent followed the waypoint right. (rewarded 2.64)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (5, 2), heading: (1, 0), action: right, reward: 2.34245236823
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 13, 't': 7, 'action': 'right', 'reward': 2.3424523682287193, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 2.34)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (6, 2), heading: (1, 0), action: forward, reward: 2.23408769439
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 12, 't': 8, 'action': 'forward', 'reward': 2.2340876943935672, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.23)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (7, 2), heading: (1, 0), action: forward, reward: 1.13927164059
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 11, 't': 9, 'action': 'forward', 'reward': 1.1392716405870644, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.14)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (7, 3), heading: (0, 1), action: right, reward: -0.175995915251
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 10, 't': 10, 'action': 'right', 'reward': -0.17599591525085267, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove right instead of left. (rewarded -0.18)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (7, 3), heading: (0, 1), action: None, reward: -5.13688692549
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 9, 't': 11, 'action': None, 'reward': -5.136886925489933, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.14)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (6, 3), heading: (-1, 0), action: right, reward: 1.04081892789
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 8, 't': 12, 'action': 'right', 'reward': 1.0408189278861204, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent followed the waypoint right. (rewarded 1.04)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (6, 3), heading: (-1, 0), action: forward, reward: -9.80997195343
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 7, 't': 13, 'action': 'forward', 'reward': -9.809971953427487, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -9.81)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (6, 3), heading: (-1, 0), action: forward, reward: -9.87713670625
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'right', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', 'right', None), 'deadline': 6, 't': 14, 'action': 'forward', 'reward': -9.877136706251427, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', None)
Agent attempted driving forward through a red light. (rewarded -9.88)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (6, 3), heading: (-1, 0), action: None, reward: -4.72295562209
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 5, 't': 15, 'action': None, 'reward': -4.722955622088075, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.72)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (6, 3), heading: (-1, 0), action: None, reward: -4.03000539625
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 4, 't': 16, 'action': None, 'reward': -4.030005396254276, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent idled at a green light with no oncoming traffic. (rewarded -4.03)
15% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (6, 3), heading: (-1, 0), action: left, reward: -9.93757833315
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': 'right'}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', 'forward', 'right'), 'deadline': 3, 't': 17, 'action': 'left', 'reward': -9.937578333147117, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', 'right')
Agent attempted driving left through a red light. (rewarded -9.94)
10% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (6, 3), heading: (-1, 0), action: None, reward: -0.0512062586573
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 2, 't': 18, 'action': None, 'reward': -0.051206258657299886, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded -0.05)
5% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (6, 2), heading: (0, -1), action: right, reward: 0.377229093882
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 1, 't': 19, 'action': 'right', 'reward': 0.3772290938817273, 'waypoint': 'right'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('right', 'green', 'forward', None)
Agent followed the waypoint right. (rewarded 0.38)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 42
\-------------------------
Environment.reset(): Trial set up with start = (8, 5), destination = (3, 4), deadline = 20
Simulating trial. . .
epsilon = 0.6637; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 5), heading: (1, 0), action: forward, reward: 1.41797135814
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'left'), 'deadline': 20, 't': 0, 'action': 'forward', 'reward': 1.4179713581431421, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'left')
Agent followed the waypoint forward. (rewarded 1.42)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 5), heading: (1, 0), action: left, reward: -9.5789451563
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'right'}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'forward', 'right'), 'deadline': 19, 't': 1, 'action': 'left', 'reward': -9.578945156299119, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'right')
Agent attempted driving left through a red light. (rewarded -9.58)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 6), heading: (0, 1), action: right, reward: 0.175576042928
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 18, 't': 2, 'action': 'right', 'reward': 0.1755760429275529, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent drove right instead of forward. (rewarded 0.18)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (1, 6), heading: (0, 1), action: left, reward: -10.6664307137
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 17, 't': 3, 'action': 'left', 'reward': -10.666430713686188, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent attempted driving left through a red light. (rewarded -10.67)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: left, reward: 2.74612774027
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 16, 't': 4, 'action': 'left', 'reward': 2.746127740268461, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.75)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: right, reward: -20.1944669209
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'right', 'left': 'forward'}, 'violation': 3, 'light': 'red', 'state': ('forward', 'red', 'right', 'forward'), 'deadline': 15, 't': 5, 'action': 'right', 'reward': -20.19446692089137, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', 'forward')
Agent attempted driving right through traffic and cause a minor accident. (rewarded -20.19)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: None, reward: 2.22132289995
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 14, 't': 6, 'action': None, 'reward': 2.2213228999534764, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.22)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: None, reward: 2.77733978156
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 13, 't': 7, 'action': None, 'reward': 2.7773397815591663, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.78)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: None, reward: 1.32586792442
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 12, 't': 8, 'action': None, 'reward': 1.32586792441953, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 1.33)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: None, reward: -4.92516089702
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 11, 't': 9, 'action': None, 'reward': -4.92516089702323, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.93)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: forward, reward: 1.90948104147
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 10, 't': 10, 'action': 'forward', 'reward': 1.9094810414719254, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.91)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: left, reward: -39.9907105038
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 9, 't': 11, 'action': 'left', 'reward': -39.990710503754094, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -39.99)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: None, reward: 0.869764824454
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 8, 't': 12, 'action': None, 'reward': 0.8697648244540603, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.87)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (3, 5), heading: (0, -1), action: left, reward: 0.935585000709
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 7, 't': 13, 'action': 'left', 'reward': 0.9355850007088848, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 0.94)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: left, reward: 1.25260349981
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'right'), 'deadline': 6, 't': 14, 'action': 'left', 'reward': 1.2526034998056699, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'right')
Agent drove left instead of forward. (rewarded 1.25)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: left, reward: 0.986461782841
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 5, 't': 15, 'action': 'left', 'reward': 0.9864617828405711, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent drove left instead of right. (rewarded 0.99)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: None, reward: -5.64385852826
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 4, 't': 16, 'action': None, 'reward': -5.643858528258795, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.64)
15% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: forward, reward: -10.5966629518
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 3, 't': 17, 'action': 'forward', 'reward': -10.596662951758683, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent attempted driving forward through a red light. (rewarded -10.60)
10% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: None, reward: 1.97303726387
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'right'), 'deadline': 2, 't': 18, 'action': None, 'reward': 1.9730372638668456, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 1.97)
5% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: None, reward: 0.654360984574
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 1, 't': 19, 'action': None, 'reward': 0.6543609845735385, 'waypoint': 'left'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.65)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 43
\-------------------------
Environment.reset(): Trial set up with start = (8, 7), destination = (3, 6), deadline = 20
Simulating trial. . .
epsilon = 0.6570; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 6), heading: (0, -1), action: right, reward: 1.30646307446
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.3064630744592314, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 1.31)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 5), heading: (0, -1), action: forward, reward: 0.740397109523
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 19, 't': 1, 'action': 'forward', 'reward': 0.7403971095229308, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent drove forward instead of right. (rewarded 0.74)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 5), heading: (0, -1), action: None, reward: 0.851259529885
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 18, 't': 2, 'action': None, 'reward': 0.8512595298850839, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.85)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (1, 5), heading: (1, 0), action: right, reward: 2.84190610363
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 17, 't': 3, 'action': 'right', 'reward': 2.84190610363028, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.84)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 5), heading: (1, 0), action: left, reward: -9.74964213621
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 16, 't': 4, 'action': 'left', 'reward': -9.749642136209784, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent attempted driving left through a red light. (rewarded -9.75)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: forward, reward: 1.36548969339
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 1.3654896933912044, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.37)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: None, reward: 1.63764026787
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 14, 't': 6, 'action': None, 'reward': 1.637640267873398, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.64)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: right, reward: 0.186148746029
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 13, 't': 7, 'action': 'right', 'reward': 0.18614874602891374, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove right instead of forward. (rewarded 0.19)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: left, reward: 1.90408601231
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 12, 't': 8, 'action': 'left', 'reward': 1.9040860123111796, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.90)
55% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 44
\-------------------------
Environment.reset(): Trial set up with start = (8, 6), destination = (4, 7), deadline = 25
Simulating trial. . .
epsilon = 0.6505; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: right, reward: 1.37741134257
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 1.377411342566589, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.38)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: forward, reward: 2.06557311809
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 24, 't': 1, 'action': 'forward', 'reward': 2.065573118091102, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.07)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 7), heading: (0, 1), action: right, reward: 1.41229233735
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 23, 't': 2, 'action': 'right', 'reward': 1.4122923373548475, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove right instead of forward. (rewarded 1.41)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 7), heading: (0, 1), action: left, reward: -10.0389201893
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 22, 't': 3, 'action': 'left', 'reward': -10.038920189257375, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -10.04)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 2), heading: (0, 1), action: forward, reward: 1.09153729871
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', None), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 1.0915372987122116, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', None)
Agent drove forward instead of left. (rewarded 1.09)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 2), heading: (-1, 0), action: right, reward: 1.82695771002
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 20, 't': 5, 'action': 'right', 'reward': 1.8269577100202898, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent drove right instead of left. (rewarded 1.83)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: forward, reward: 0.0813204859605
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 19, 't': 6, 'action': 'forward', 'reward': 0.08132048596050045, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent drove forward instead of right. (rewarded 0.08)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (8, 3), heading: (0, 1), action: left, reward: 0.345125776535
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 18, 't': 7, 'action': 'left', 'reward': 0.3451257765350717, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent drove left instead of right. (rewarded 0.35)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (8, 3), heading: (0, 1), action: None, reward: 1.21783716177
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'right'), 'deadline': 17, 't': 8, 'action': None, 'reward': 1.2178371617720765, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 1.22)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (7, 3), heading: (-1, 0), action: right, reward: 1.54234730309
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 16, 't': 9, 'action': 'right', 'reward': 1.5423473030946526, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 1.54)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (7, 2), heading: (0, -1), action: right, reward: -0.0798256690142
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 15, 't': 10, 'action': 'right', 'reward': -0.07982566901420085, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent drove right instead of forward. (rewarded -0.08)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (8, 2), heading: (1, 0), action: right, reward: 1.26993769329
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 14, 't': 11, 'action': 'right', 'reward': 1.2699376932862272, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 1.27)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (8, 2), heading: (1, 0), action: None, reward: 1.66784802491
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 13, 't': 12, 'action': None, 'reward': 1.667848024910658, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.67)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (8, 2), heading: (1, 0), action: None, reward: -4.45328157796
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 12, 't': 13, 'action': None, 'reward': -4.453281577960948, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.45)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (8, 3), heading: (0, 1), action: right, reward: 0.183113037156
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 11, 't': 14, 'action': 'right', 'reward': 0.1831130371563373, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent drove right instead of forward. (rewarded 0.18)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (8, 3), heading: (0, 1), action: forward, reward: -10.0618094316
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, 'right'), 'deadline': 10, 't': 15, 'action': 'forward', 'reward': -10.061809431610012, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'right')
Agent attempted driving forward through a red light. (rewarded -10.06)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (8, 3), heading: (0, 1), action: None, reward: 0.999522681275
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 9, 't': 16, 'action': None, 'reward': 0.999522681275429, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.00)
32% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: left, reward: 1.58151800453
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 8, 't': 17, 'action': 'left', 'reward': 1.5815180045280233, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.58)
28% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: None, reward: 2.25115846115
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 7, 't': 18, 'action': None, 'reward': 2.2511584611549487, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 2.25)
24% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (1, 4), heading: (0, 1), action: right, reward: -0.0156887278153
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 6, 't': 19, 'action': 'right', 'reward': -0.01568872781526298, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent drove right instead of forward. (rewarded -0.02)
20% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (2, 4), heading: (1, 0), action: left, reward: 0.517753367974
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 5, 't': 20, 'action': 'left', 'reward': 0.5177533679736395, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 0.52)
16% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (2, 5), heading: (0, 1), action: right, reward: -0.596170397867
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 4, 't': 21, 'action': 'right', 'reward': -0.5961703978665961, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent drove right instead of forward. (rewarded -0.60)
12% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: forward, reward: 0.836343021741
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', 'forward'), 'deadline': 3, 't': 22, 'action': 'forward', 'reward': 0.8363430217413022, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', 'forward')
Agent drove forward instead of left. (rewarded 0.84)
8% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: right, reward: 0.222985145557
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 2, 't': 23, 'action': 'right', 'reward': 0.22298514555731241, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 0.22)
4% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (1, 5), heading: (0, -1), action: right, reward: 0.66088906344
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 1, 't': 24, 'action': 'right', 'reward': 0.6608890634399611, 'waypoint': 'left'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('left', 'red', None, None)
Agent drove right instead of left. (rewarded 0.66)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 45
\-------------------------
Environment.reset(): Trial set up with start = (5, 4), destination = (1, 2), deadline = 30
Simulating trial. . .
epsilon = 0.6440; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 4), heading: (0, 1), action: left, reward: -9.71205359574
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 30, 't': 0, 'action': 'left', 'reward': -9.712053595737231, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -9.71)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (4, 4), heading: (-1, 0), action: right, reward: 1.31208510108
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'right'), 'deadline': 29, 't': 1, 'action': 'right', 'reward': 1.3120851010815118, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'right')
Agent drove right instead of left. (rewarded 1.31)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (4, 4), heading: (-1, 0), action: None, reward: -4.24980766059
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 28, 't': 2, 'action': None, 'reward': -4.249807660587693, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -4.25)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (4, 4), heading: (-1, 0), action: None, reward: -5.54382481352
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'forward', 'forward'), 'deadline': 27, 't': 3, 'action': None, 'reward': -5.543824813520841, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -5.54)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (4, 3), heading: (0, -1), action: right, reward: 1.86558588105
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 26, 't': 4, 'action': 'right', 'reward': 1.865585881052601, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent drove right instead of forward. (rewarded 1.87)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (3, 3), heading: (-1, 0), action: left, reward: 2.86049197974
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 25, 't': 5, 'action': 'left', 'reward': 2.8604919797389092, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.86)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 3), heading: (-1, 0), action: None, reward: 2.16726495493
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 24, 't': 6, 'action': None, 'reward': 2.167264954934371, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.17)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 3), heading: (-1, 0), action: forward, reward: 1.11235150846
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 23, 't': 7, 'action': 'forward', 'reward': 1.1123515084643811, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.11)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 2), heading: (0, -1), action: right, reward: 0.605462240591
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 22, 't': 8, 'action': 'right', 'reward': 0.6054622405910018, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'right')
Agent drove right instead of forward. (rewarded 0.61)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 2), heading: (0, -1), action: None, reward: 1.56491412913
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 21, 't': 9, 'action': None, 'reward': 1.5649141291324544, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.56)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (2, 2), heading: (0, -1), action: left, reward: -10.3311194738
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 20, 't': 10, 'action': 'left', 'reward': -10.331119473846222, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -10.33)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (3, 2), heading: (1, 0), action: right, reward: 1.629076428
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 19, 't': 11, 'action': 'right', 'reward': 1.629076428004542, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent drove right instead of left. (rewarded 1.63)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (3, 2), heading: (1, 0), action: None, reward: 0.1396961959
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'right', None), 'deadline': 18, 't': 12, 'action': None, 'reward': 0.1396961958995221, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 0.14)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (3, 3), heading: (0, 1), action: right, reward: 1.96440371025
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 17, 't': 13, 'action': 'right', 'reward': 1.9644037102474223, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent followed the waypoint right. (rewarded 1.96)
53% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (3, 3), heading: (0, 1), action: None, reward: -4.39257256906
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', None, 'right'), 'deadline': 16, 't': 14, 'action': None, 'reward': -4.392572569060512, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'right')
Agent idled at a green light with no oncoming traffic. (rewarded -4.39)
50% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (3, 3), heading: (0, 1), action: None, reward: 0.302514401732
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 15, 't': 15, 'action': None, 'reward': 0.30251440173155253, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.30)
47% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (3, 3), heading: (0, 1), action: forward, reward: -10.4457395059
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 14, 't': 16, 'action': 'forward', 'reward': -10.445739505895261, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -10.45)
43% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (3, 4), heading: (0, 1), action: forward, reward: 0.253319858338
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 13, 't': 17, 'action': 'forward', 'reward': 0.2533198583377987, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent drove forward instead of right. (rewarded 0.25)
40% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (3, 4), heading: (0, 1), action: None, reward: -0.275385077063
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', 'forward'), 'deadline': 12, 't': 18, 'action': None, 'reward': -0.27538507706329085, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded -0.28)
37% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (2, 4), heading: (-1, 0), action: right, reward: 2.27063530331
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'right', None), 'deadline': 11, 't': 19, 'action': 'right', 'reward': 2.270635303310259, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', None)
Agent followed the waypoint right. (rewarded 2.27)
33% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: forward, reward: 1.64460580613
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 10, 't': 20, 'action': 'forward', 'reward': 1.6446058061322388, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.64)
30% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (8, 4), heading: (-1, 0), action: forward, reward: 0.233802909532
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 9, 't': 21, 'action': 'forward', 'reward': 0.23380290953172034, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent drove forward instead of right. (rewarded 0.23)
27% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (8, 3), heading: (0, -1), action: right, reward: 2.37265327446
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'right'), 'deadline': 8, 't': 22, 'action': 'right', 'reward': 2.3726532744611672, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'right')
Agent followed the waypoint right. (rewarded 2.37)
23% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (8, 3), heading: (0, -1), action: left, reward: -10.7737926462
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 7, 't': 23, 'action': 'left', 'reward': -10.773792646245544, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -10.77)
20% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: right, reward: 1.46965500261
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 6, 't': 24, 'action': 'right', 'reward': 1.4696550026119692, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.47)
17% of time remaining to reach destination.
/-------------------
| Step 25 Results
\-------------------
Environment.step(): t = 25
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: None, reward: 2.18028647958
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 5, 't': 25, 'action': None, 'reward': 2.1802864795844545, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.18)
13% of time remaining to reach destination.
/-------------------
| Step 26 Results
\-------------------
Environment.step(): t = 26
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: None, reward: 0.519765595342
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 4, 't': 26, 'action': None, 'reward': 0.5197655953422844, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.52)
10% of time remaining to reach destination.
/-------------------
| Step 27 Results
\-------------------
Environment.step(): t = 27
Environment.act() [POST]: location: (1, 4), heading: (0, 1), action: right, reward: 0.429100597519
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 3, 't': 27, 'action': 'right', 'reward': 0.42910059751894036, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent drove right instead of left. (rewarded 0.43)
7% of time remaining to reach destination.
/-------------------
| Step 28 Results
\-------------------
Environment.step(): t = 28
Environment.act() [POST]: location: (1, 4), heading: (0, 1), action: None, reward: -5.03870936845
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 2, 't': 28, 'action': None, 'reward': -5.038709368450959, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.04)
3% of time remaining to reach destination.
/-------------------
| Step 29 Results
\-------------------
Environment.step(): t = 29
Environment.act() [POST]: location: (1, 4), heading: (0, 1), action: forward, reward: -40.1459103442
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 4, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 1, 't': 29, 'action': 'forward', 'reward': -40.14591034415018, 'waypoint': 'right'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('right', 'red', None, None)
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -40.15)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 46
\-------------------------
Environment.reset(): Trial set up with start = (5, 5), destination = (2, 6), deadline = 20
Simulating trial. . .
epsilon = 0.6376; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 5), heading: (1, 0), action: right, reward: 1.30582847451
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.3058284745121456, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent drove right instead of left. (rewarded 1.31)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 6), heading: (0, 1), action: right, reward: 0.498409820666
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 0.498409820666078, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent drove right instead of forward. (rewarded 0.50)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 6), heading: (1, 0), action: left, reward: 1.10797407622
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 18, 't': 2, 'action': 'left', 'reward': 1.1079740762174035, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.11)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: forward, reward: 1.33559783666
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 1.335597836655446, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.34)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: None, reward: 2.90243778837
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 16, 't': 4, 'action': None, 'reward': 2.902437788369575, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.90)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: None, reward: 1.70656853788
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 15, 't': 5, 'action': None, 'reward': 1.7065685378827866, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 1.71)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (8, 7), heading: (0, 1), action: right, reward: 1.18732125198
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 14, 't': 6, 'action': 'right', 'reward': 1.1873212519801941, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent drove right instead of forward. (rewarded 1.19)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (8, 7), heading: (0, 1), action: right, reward: -20.4005874484
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 3, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 13, 't': 7, 'action': 'right', 'reward': -20.400587448374928, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent attempted driving right through traffic and cause a minor accident. (rewarded -20.40)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: right, reward: 0.866306148216
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 12, 't': 8, 'action': 'right', 'reward': 0.8663061482163482, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent drove right instead of left. (rewarded 0.87)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (7, 6), heading: (0, -1), action: right, reward: 0.848355536917
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 11, 't': 9, 'action': 'right', 'reward': 0.8483555369169264, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 0.85)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: right, reward: 2.45156819178
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 10, 't': 10, 'action': 'right', 'reward': 2.4515681917811314, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent followed the waypoint right. (rewarded 2.45)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: None, reward: 2.11078181606
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 9, 't': 11, 'action': None, 'reward': 2.11078181606453, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.11)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: None, reward: 1.86611829897
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 8, 't': 12, 'action': None, 'reward': 1.8661182989744023, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.87)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: forward, reward: 1.63224072556
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'forward'), 'deadline': 7, 't': 13, 'action': 'forward', 'reward': 1.6322407255589706, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'forward')
Agent followed the waypoint forward. (rewarded 1.63)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: None, reward: 1.19559229968
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 6, 't': 14, 'action': None, 'reward': 1.1955922996844184, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.20)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: None, reward: 2.16297872225
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 5, 't': 15, 'action': None, 'reward': 2.162978722250866, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 2.16)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: forward, reward: 1.94205609281
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 4, 't': 16, 'action': 'forward', 'reward': 1.9420560928141712, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.94)
15% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 47
\-------------------------
Environment.reset(): Trial set up with start = (2, 3), destination = (4, 7), deadline = 20
Simulating trial. . .
epsilon = 0.6313; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 2.29050342027
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 20, 't': 0, 'action': None, 'reward': 2.290503420272654, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.29)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 4), heading: (0, 1), action: right, reward: 0.626180690006
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 0.6261806900056505, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent drove right instead of forward. (rewarded 0.63)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 4), heading: (0, 1), action: None, reward: 2.02038654403
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.0203865440263797, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.02)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 4), heading: (0, 1), action: None, reward: 1.64193145969
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 17, 't': 3, 'action': None, 'reward': 1.6419314596922359, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.64)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 5), heading: (0, 1), action: forward, reward: 0.614379930863
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 0.6143799308625523, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent drove forward instead of left. (rewarded 0.61)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: forward, reward: 0.533994520637
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 0.5339945206373304, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove forward instead of left. (rewarded 0.53)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: left, reward: -9.88796979056
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 14, 't': 6, 'action': 'left', 'reward': -9.887969790563051, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -9.89)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: forward, reward: -10.4360728942
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': -10.436072894187342, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -10.44)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: left, reward: -40.9173370553
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 12, 't': 8, 'action': 'left', 'reward': -40.91733705531302, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -40.92)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 7), heading: (0, 1), action: forward, reward: 0.447781136467
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 11, 't': 9, 'action': 'forward', 'reward': 0.4477811364674187, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove forward instead of left. (rewarded 0.45)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (2, 7), heading: (0, 1), action: None, reward: -4.62731774702
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 10, 't': 10, 'action': None, 'reward': -4.627317747015148, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -4.63)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (2, 7), heading: (0, 1), action: None, reward: -5.74560674759
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 9, 't': 11, 'action': None, 'reward': -5.74560674758824, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -5.75)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (1, 7), heading: (-1, 0), action: right, reward: 0.501788808098
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', 'forward'), 'deadline': 8, 't': 12, 'action': 'right', 'reward': 0.5017888080983841, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', 'forward')
Agent drove right instead of left. (rewarded 0.50)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (1, 7), heading: (-1, 0), action: left, reward: -39.1504482987
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('right', 'red', 'left', 'forward'), 'deadline': 7, 't': 13, 'action': 'left', 'reward': -39.15044829871806, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', 'forward')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -39.15)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (1, 6), heading: (0, -1), action: right, reward: 1.8026425164
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', 'left'), 'deadline': 6, 't': 14, 'action': 'right', 'reward': 1.8026425163950437, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', 'left')
Agent followed the waypoint right. (rewarded 1.80)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: right, reward: 2.0190137151
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 5, 't': 15, 'action': 'right', 'reward': 2.019013715101362, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.02)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (2, 5), heading: (0, -1), action: left, reward: -0.0519286078428
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 4, 't': 16, 'action': 'left', 'reward': -0.051928607842774976, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent drove left instead of forward. (rewarded -0.05)
15% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: right, reward: 1.58749659308
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 3, 't': 17, 'action': 'right', 'reward': 1.5874965930790785, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.59)
10% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: forward, reward: 1.59091127107
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 2, 't': 18, 'action': 'forward', 'reward': 1.5909112710710045, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.59)
5% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (4, 6), heading: (0, 1), action: right, reward: 1.92212561964
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 1, 't': 19, 'action': 'right', 'reward': 1.9221256196446517, 'waypoint': 'right'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.92)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 48
\-------------------------
Environment.reset(): Trial set up with start = (3, 7), destination = (5, 5), deadline = 20
Simulating trial. . .
epsilon = 0.6250; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (4, 7), heading: (1, 0), action: right, reward: 2.52862907902
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 2.528629079015401, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.53)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (4, 2), heading: (0, 1), action: right, reward: 1.17414702288
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'left'), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 1.17414702287715, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'left')
Agent drove right instead of forward. (rewarded 1.17)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (4, 2), heading: (0, 1), action: forward, reward: -10.3135034046
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 18, 't': 2, 'action': 'forward', 'reward': -10.313503404592755, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent attempted driving forward through a red light. (rewarded -10.31)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 2), heading: (-1, 0), action: right, reward: 1.58618391486
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 17, 't': 3, 'action': 'right', 'reward': 1.5861839148592591, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 1.59)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: forward, reward: 0.239789141521
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'right', 'forward'), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 0.23978914152073127, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'right', 'forward')
Agent drove forward instead of right. (rewarded 0.24)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 7), heading: (0, -1), action: right, reward: 2.58086405009
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 15, 't': 5, 'action': 'right', 'reward': 2.580864050087273, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.58)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 7), heading: (1, 0), action: right, reward: 1.1540341369
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 14, 't': 6, 'action': 'right', 'reward': 1.1540341368962832, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.15)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 2), heading: (0, 1), action: right, reward: 1.36919015037
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', 'right'), 'deadline': 13, 't': 7, 'action': 'right', 'reward': 1.36919015036718, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', 'right')
Agent drove right instead of forward. (rewarded 1.37)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (4, 2), heading: (1, 0), action: left, reward: 1.28448788408
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 12, 't': 8, 'action': 'left', 'reward': 1.2844878840823675, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 1.28)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (5, 2), heading: (1, 0), action: forward, reward: 1.00341106758
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 11, 't': 9, 'action': 'forward', 'reward': 1.003411067582312, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.00)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (5, 2), heading: (1, 0), action: left, reward: -10.8363529533
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 10, 't': 10, 'action': 'left', 'reward': -10.836352953276373, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent attempted driving left through a red light. (rewarded -10.84)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (5, 2), heading: (1, 0), action: forward, reward: -39.6063309558
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 9, 't': 11, 'action': 'forward', 'reward': -39.60633095579773, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -39.61)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (5, 2), heading: (1, 0), action: None, reward: 0.702656382456
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 8, 't': 12, 'action': None, 'reward': 0.7026563824563989, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.70)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (6, 2), heading: (1, 0), action: forward, reward: 1.21556248092
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', 'right'), 'deadline': 7, 't': 13, 'action': 'forward', 'reward': 1.2155624809157235, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', 'right')
Agent drove forward instead of left. (rewarded 1.22)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (6, 2), heading: (1, 0), action: None, reward: 1.38216423943
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'right'), 'deadline': 6, 't': 14, 'action': None, 'reward': 1.3821642394266602, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 1.38)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (6, 7), heading: (0, -1), action: left, reward: 2.23405070172
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 5, 't': 15, 'action': 'left', 'reward': 2.234050701719327, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.23)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (6, 7), heading: (0, -1), action: None, reward: 1.11462264783
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 4, 't': 16, 'action': None, 'reward': 1.1146226478312415, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.11)
15% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (6, 6), heading: (0, -1), action: forward, reward: -0.218377652458
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 3, 't': 17, 'action': 'forward', 'reward': -0.21837765245814855, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove forward instead of left. (rewarded -0.22)
10% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (5, 6), heading: (-1, 0), action: left, reward: 2.13475625104
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 2, 't': 18, 'action': 'left', 'reward': 2.1347562510403995, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.13)
5% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 5), heading: (0, -1), action: right, reward: 2.10677414093
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 1, 't': 19, 'action': 'right', 'reward': 2.106774140925041, 'waypoint': 'right'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('right', 'red', 'forward', None)
Agent followed the waypoint right. (rewarded 2.11)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 49
\-------------------------
Environment.reset(): Trial set up with start = (4, 6), destination = (8, 3), deadline = 35
Simulating trial. . .
epsilon = 0.6188; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (4, 6), heading: (0, -1), action: forward, reward: -9.35921391529
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 35, 't': 0, 'action': 'forward', 'reward': -9.359213915287567, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -9.36)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (4, 6), heading: (0, -1), action: left, reward: -40.9475810436
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 34, 't': 1, 'action': 'left', 'reward': -40.94758104359118, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -40.95)
94% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (4, 6), heading: (0, -1), action: None, reward: 1.52719047161
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 33, 't': 2, 'action': None, 'reward': 1.527190471612654, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.53)
91% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (4, 6), heading: (0, -1), action: None, reward: 2.36638634516
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 32, 't': 3, 'action': None, 'reward': 2.3663863451634937, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.37)
89% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 6), heading: (-1, 0), action: left, reward: 2.21990588728
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 31, 't': 4, 'action': 'left', 'reward': 2.219905887275612, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.22)
86% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (3, 6), heading: (-1, 0), action: right, reward: -20.4834515058
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 3, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 30, 't': 5, 'action': 'right', 'reward': -20.483451505786206, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent attempted driving right through traffic and cause a minor accident. (rewarded -20.48)
83% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 5), heading: (0, -1), action: right, reward: 1.52122762555
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 29, 't': 6, 'action': 'right', 'reward': 1.5212276255464376, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent drove right instead of forward. (rewarded 1.52)
80% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: right, reward: 1.29577016527
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 28, 't': 7, 'action': 'right', 'reward': 1.2957701652726086, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent drove right instead of left. (rewarded 1.30)
77% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: None, reward: -4.92425903811
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', 'forward', 'forward'), 'deadline': 27, 't': 8, 'action': None, 'reward': -4.924259038110881, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -4.92)
74% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: None, reward: 2.62293125568
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 26, 't': 9, 'action': None, 'reward': 2.622931255675576, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.62)
71% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: None, reward: 1.34518377202
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'right'), 'deadline': 25, 't': 10, 'action': None, 'reward': 1.3451837720189286, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 1.35)
69% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: None, reward: 2.34750589184
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 24, 't': 11, 'action': None, 'reward': 2.3475058918394014, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.35)
66% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: left, reward: -9.93512404177
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 23, 't': 12, 'action': 'left', 'reward': -9.935124041770926, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -9.94)
63% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (4, 4), heading: (0, -1), action: left, reward: 2.71209022523
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'right'), 'deadline': 22, 't': 13, 'action': 'left', 'reward': 2.71209022522982, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'right')
Agent followed the waypoint left. (rewarded 2.71)
60% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (5, 4), heading: (1, 0), action: right, reward: 1.49053990989
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 21, 't': 14, 'action': 'right', 'reward': 1.4905399098855256, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent drove right instead of left. (rewarded 1.49)
57% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (5, 5), heading: (0, 1), action: right, reward: 0.563274764911
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 20, 't': 15, 'action': 'right', 'reward': 0.5632747649106323, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove right instead of forward. (rewarded 0.56)
54% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (6, 5), heading: (1, 0), action: left, reward: 1.84378111251
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 19, 't': 16, 'action': 'left', 'reward': 1.8437811125147625, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 1.84)
51% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (6, 5), heading: (1, 0), action: left, reward: -10.8824163224
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 18, 't': 17, 'action': 'left', 'reward': -10.882416322391709, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -10.88)
49% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (6, 5), heading: (1, 0), action: left, reward: -9.96352184082
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 17, 't': 18, 'action': 'left', 'reward': -9.963521840817164, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -9.96)
46% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (7, 5), heading: (1, 0), action: forward, reward: 2.46501800426
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 16, 't': 19, 'action': 'forward', 'reward': 2.465018004255861, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent followed the waypoint forward. (rewarded 2.47)
43% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (7, 6), heading: (0, 1), action: right, reward: -0.104702064545
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 15, 't': 20, 'action': 'right', 'reward': -0.10470206454498365, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent drove right instead of forward. (rewarded -0.10)
40% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (7, 7), heading: (0, 1), action: forward, reward: 1.27202823024
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 14, 't': 21, 'action': 'forward', 'reward': 1.2720282302369008, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent drove forward instead of left. (rewarded 1.27)
37% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (7, 7), heading: (0, 1), action: left, reward: -20.122074872
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 3, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 13, 't': 22, 'action': 'left', 'reward': -20.122074872006284, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent attempted driving left through traffic and cause a minor accident. (rewarded -20.12)
34% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: left, reward: 1.77755492897
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 12, 't': 23, 'action': 'left', 'reward': 1.7775549289706356, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 1.78)
31% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (8, 2), heading: (0, 1), action: right, reward: 1.86389206123
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'right'), 'deadline': 11, 't': 24, 'action': 'right', 'reward': 1.8638920612300611, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'right')
Agent followed the waypoint right. (rewarded 1.86)
29% of time remaining to reach destination.
/-------------------
| Step 25 Results
\-------------------
Environment.step(): t = 25
Environment.act() [POST]: location: (8, 2), heading: (0, 1), action: forward, reward: -39.7466894379
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 10, 't': 25, 'action': 'forward', 'reward': -39.746689437855615, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -39.75)
26% of time remaining to reach destination.
/-------------------
| Step 26 Results
\-------------------
Environment.step(): t = 26
Environment.act() [POST]: location: (8, 2), heading: (0, 1), action: None, reward: 1.99444938636
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'right'), 'deadline': 9, 't': 26, 'action': None, 'reward': 1.99444938635785, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 1.99)
23% of time remaining to reach destination.
/-------------------
| Step 27 Results
\-------------------
Environment.step(): t = 27
Environment.act() [POST]: location: (8, 2), heading: (0, 1), action: None, reward: 2.20673098042
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 8, 't': 27, 'action': None, 'reward': 2.2067309804227664, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.21)
20% of time remaining to reach destination.
/-------------------
| Step 28 Results
\-------------------
Environment.step(): t = 28
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (8, 3), heading: (0, 1), action: forward, reward: 2.00764691101
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 7, 't': 28, 'action': 'forward', 'reward': 2.007646911007005, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.01)
17% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 50
\-------------------------
Environment.reset(): Trial set up with start = (1, 7), destination = (6, 5), deadline = 25
Simulating trial. . .
epsilon = 0.6126; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 7), heading: (-1, 0), action: forward, reward: -10.7898025076
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'forward', 'left'), 'deadline': 25, 't': 0, 'action': 'forward', 'reward': -10.789802507632885, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'left')
Agent attempted driving forward through a red light. (rewarded -10.79)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 7), heading: (-1, 0), action: left, reward: -10.387154088
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'right', 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'forward', 'left'), 'deadline': 24, 't': 1, 'action': 'left', 'reward': -10.387154087989181, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'left')
Agent attempted driving left through a red light. (rewarded -10.39)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 7), heading: (-1, 0), action: None, reward: 2.0435469188
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 23, 't': 2, 'action': None, 'reward': 2.043546918795786, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 2.04)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: forward, reward: 2.36831135117
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': 2.3683113511699974, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.37)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: None, reward: 2.13603707167
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'forward'), 'deadline': 21, 't': 4, 'action': None, 'reward': 2.136037071674841, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 2.14)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: forward, reward: -10.2487156833
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'forward', 'left'), 'deadline': 20, 't': 5, 'action': 'forward', 'reward': -10.248715683285262, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'left')
Agent attempted driving forward through a red light. (rewarded -10.25)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: None, reward: 1.28496020239
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 19, 't': 6, 'action': None, 'reward': 1.2849602023863462, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.28)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: forward, reward: 1.05421753011
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 1.0542175301050896, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.05)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: None, reward: 1.61252668746
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 17, 't': 8, 'action': None, 'reward': 1.6125266874649717, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.61)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: forward, reward: -40.9521248785
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 16, 't': 9, 'action': 'forward', 'reward': -40.95212487848555, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -40.95)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (7, 6), heading: (0, -1), action: right, reward: 1.09744977244
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 15, 't': 10, 'action': 'right', 'reward': 1.0974497724432135, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent drove right instead of forward. (rewarded 1.10)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (7, 6), heading: (0, -1), action: left, reward: -9.74431359165
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 14, 't': 11, 'action': 'left', 'reward': -9.74431359164636, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -9.74)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (7, 6), heading: (0, -1), action: left, reward: -9.04237747224
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 13, 't': 12, 'action': 'left', 'reward': -9.04237747223669, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent attempted driving left through a red light. (rewarded -9.04)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: right, reward: 1.6488231807
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 12, 't': 13, 'action': 'right', 'reward': 1.6488231807042455, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent drove right instead of left. (rewarded 1.65)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (8, 5), heading: (0, -1), action: left, reward: 0.966903692079
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 11, 't': 14, 'action': 'left', 'reward': 0.9669036920792236, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 0.97)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (7, 5), heading: (-1, 0), action: left, reward: 0.881058499472
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 10, 't': 15, 'action': 'left', 'reward': 0.8810584994719104, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 0.88)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (7, 4), heading: (0, -1), action: right, reward: 0.368591691968
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 9, 't': 16, 'action': 'right', 'reward': 0.36859169196769626, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent drove right instead of forward. (rewarded 0.37)
32% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (6, 4), heading: (-1, 0), action: left, reward: 1.66918320477
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'forward'), 'deadline': 8, 't': 17, 'action': 'left', 'reward': 1.669183204771334, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'forward')
Agent followed the waypoint left. (rewarded 1.67)
28% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 5), heading: (0, 1), action: left, reward: 1.89379131653
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 7, 't': 18, 'action': 'left', 'reward': 1.8937913165258378, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 1.89)
24% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 51
\-------------------------
Environment.reset(): Trial set up with start = (3, 6), destination = (8, 2), deadline = 25
Simulating trial. . .
epsilon = 0.6065; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 6), heading: (0, 1), action: None, reward: 0.460860264369
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', 'forward'), 'deadline': 25, 't': 0, 'action': None, 'reward': 0.4608602643693689, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 0.46)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 6), heading: (0, 1), action: None, reward: 0.0757195098581
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 24, 't': 1, 'action': None, 'reward': 0.07571950985809184, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 0.08)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: right, reward: 2.2701811791
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 23, 't': 2, 'action': 'right', 'reward': 2.270181179101078, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 2.27)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: forward, reward: 1.32345880959
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': 1.3234588095884048, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.32)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: None, reward: -5.80082216468
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 21, 't': 4, 'action': None, 'reward': -5.800822164675505, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.80)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 5), heading: (0, -1), action: right, reward: 0.326020886609
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 20, 't': 5, 'action': 'right', 'reward': 0.3260208866086908, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove right instead of forward. (rewarded 0.33)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 5), heading: (0, -1), action: None, reward: -4.8820401491
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 19, 't': 6, 'action': None, 'reward': -4.8820401490960625, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.88)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 5), heading: (0, -1), action: None, reward: 1.39572780245
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 18, 't': 7, 'action': None, 'reward': 1.395727802451755, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.40)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (1, 5), heading: (0, -1), action: forward, reward: -10.6223589556
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 17, 't': 8, 'action': 'forward', 'reward': -10.622358955606677, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -10.62)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (1, 4), heading: (0, -1), action: forward, reward: 0.628621438099
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', None), 'deadline': 16, 't': 9, 'action': 'forward', 'reward': 0.6286214380990065, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', None)
Agent drove forward instead of left. (rewarded 0.63)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (1, 4), heading: (0, -1), action: left, reward: -19.9205151476
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'forward', 'left': None}, 'violation': 3, 'light': 'green', 'state': ('left', 'green', 'right', None), 'deadline': 15, 't': 10, 'action': 'left', 'reward': -19.92051514756151, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', None)
Agent attempted driving left through traffic and cause a minor accident. (rewarded -19.92)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (1, 4), heading: (0, -1), action: None, reward: 1.27710596413
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 14, 't': 11, 'action': None, 'reward': 1.2771059641346576, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.28)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (1, 4), heading: (0, -1), action: forward, reward: -10.0453855848
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 13, 't': 12, 'action': 'forward', 'reward': -10.045385584776325, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -10.05)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (8, 4), heading: (-1, 0), action: left, reward: 2.30771492485
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 12, 't': 13, 'action': 'left', 'reward': 2.307714924850016, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.31)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (8, 3), heading: (0, -1), action: right, reward: 2.40060315444
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 11, 't': 14, 'action': 'right', 'reward': 2.4006031544363315, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent followed the waypoint right. (rewarded 2.40)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (8, 3), heading: (0, -1), action: None, reward: 1.23732535065
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 10, 't': 15, 'action': None, 'reward': 1.237325350646613, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.24)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (8, 3), heading: (0, -1), action: right, reward: -20.703719299
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': 'forward'}, 'violation': 3, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 9, 't': 16, 'action': 'right', 'reward': -20.70371929897543, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent attempted driving right through traffic and cause a minor accident. (rewarded -20.70)
32% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (8, 3), heading: (0, -1), action: None, reward: -4.80263256487
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 8, 't': 17, 'action': None, 'reward': -4.802632564869118, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent idled at a green light with no oncoming traffic. (rewarded -4.80)
28% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (8, 2), heading: (0, -1), action: forward, reward: 1.33616374556
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 7, 't': 18, 'action': 'forward', 'reward': 1.33616374556085, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.34)
24% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 52
\-------------------------
Environment.reset(): Trial set up with start = (1, 7), destination = (2, 4), deadline = 20
Simulating trial. . .
epsilon = 0.6005; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: left, reward: -19.7651192763
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 3, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 20, 't': 0, 'action': 'left', 'reward': -19.765119276309946, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent attempted driving left through traffic and cause a minor accident. (rewarded -19.77)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 7), heading: (1, 0), action: left, reward: 2.83297047618
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'right'), 'deadline': 19, 't': 1, 'action': 'left', 'reward': 2.8329704761782173, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'right')
Agent followed the waypoint left. (rewarded 2.83)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 2), heading: (0, 1), action: right, reward: 1.00322831249
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': 'right', 'reward': 1.0032283124929795, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 1.00)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 2), heading: (1, 0), action: left, reward: 0.907109639717
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 17, 't': 3, 'action': 'left', 'reward': 0.9071096397171073, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent drove left instead of forward. (rewarded 0.91)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 2), heading: (1, 0), action: None, reward: 0.791140869607
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 16, 't': 4, 'action': None, 'reward': 0.7911408696068302, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.79)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (4, 2), heading: (1, 0), action: forward, reward: 1.02460819659
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 1.024608196587867, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent drove forward instead of right. (rewarded 1.02)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 2), heading: (1, 0), action: left, reward: -10.0369340973
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', None, 'right'), 'deadline': 14, 't': 6, 'action': 'left', 'reward': -10.036934097313521, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'right')
Agent attempted driving left through a red light. (rewarded -10.04)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (5, 2), heading: (1, 0), action: forward, reward: 0.0512112679882
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 0.051211267988187315, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent drove forward instead of right. (rewarded 0.05)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (5, 2), heading: (1, 0), action: forward, reward: -10.5103932662
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 12, 't': 8, 'action': 'forward', 'reward': -10.510393266174678, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -10.51)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: right, reward: 1.27787960898
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 11, 't': 9, 'action': 'right', 'reward': 1.277879608976056, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.28)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: None, reward: -4.15522057574
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 10, 't': 10, 'action': None, 'reward': -4.155220575736088, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -4.16)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: None, reward: 0.349152463607
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'forward'), 'deadline': 9, 't': 11, 'action': None, 'reward': 0.3491524636070471, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 0.35)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (4, 3), heading: (-1, 0), action: right, reward: 2.53186636097
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 8, 't': 12, 'action': 'right', 'reward': 2.5318663609701924, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.53)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (4, 2), heading: (0, -1), action: right, reward: 0.743258589485
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 7, 't': 13, 'action': 'right', 'reward': 0.7432585894851605, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent drove right instead of forward. (rewarded 0.74)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (5, 2), heading: (1, 0), action: right, reward: 1.44027530163
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 6, 't': 14, 'action': 'right', 'reward': 1.4402753016271945, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 1.44)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: right, reward: 1.78259110879
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 5, 't': 15, 'action': 'right', 'reward': 1.7825911087874426, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.78)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (4, 3), heading: (-1, 0), action: right, reward: 1.17392642569
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 4, 't': 16, 'action': 'right', 'reward': 1.1739264256916995, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 1.17)
15% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (3, 3), heading: (-1, 0), action: forward, reward: 1.33916781461
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 3, 't': 17, 'action': 'forward', 'reward': 1.3391678146105896, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.34)
10% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (3, 2), heading: (0, -1), action: right, reward: -0.726681969808
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 2, 't': 18, 'action': 'right', 'reward': -0.7266819698079807, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent drove right instead of forward. (rewarded -0.73)
5% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: left, reward: 0.708990871282
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 1, 't': 19, 'action': 'left', 'reward': 0.7089908712819701, 'waypoint': 'left'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 0.71)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 53
\-------------------------
Environment.reset(): Trial set up with start = (1, 6), destination = (7, 2), deadline = 20
Simulating trial. . .
epsilon = 0.5945; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 6), heading: (-1, 0), action: left, reward: 2.54899878783
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 20, 't': 0, 'action': 'left', 'reward': 2.548998787826208, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.55)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (7, 6), heading: (-1, 0), action: forward, reward: 1.94520986736
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 19, 't': 1, 'action': 'forward', 'reward': 1.9452098673604494, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.95)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 6), heading: (-1, 0), action: forward, reward: -9.0782691041
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': 'forward', 'reward': -9.078269104103155, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent attempted driving forward through a red light. (rewarded -9.08)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 6), heading: (-1, 0), action: None, reward: 2.94823143086
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 2.9482314308569904, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.95)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 6), heading: (-1, 0), action: None, reward: 1.97637527577
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 16, 't': 4, 'action': None, 'reward': 1.9763752757720952, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.98)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 6), heading: (-1, 0), action: left, reward: -40.0080619862
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', 'left', 'forward'), 'deadline': 15, 't': 5, 'action': 'left', 'reward': -40.00806198623927, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'forward')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -40.01)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (7, 5), heading: (0, -1), action: right, reward: 1.1066760758
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'left'), 'deadline': 14, 't': 6, 'action': 'right', 'reward': 1.1066760758030734, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'left')
Agent drove right instead of left. (rewarded 1.11)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (7, 5), heading: (0, -1), action: forward, reward: -40.3006284401
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 4, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': -40.30062844013268, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -40.30)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (8, 5), heading: (1, 0), action: right, reward: 2.53979933136
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', 'left'), 'deadline': 12, 't': 8, 'action': 'right', 'reward': 2.5397993313615896, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', 'left')
Agent followed the waypoint right. (rewarded 2.54)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (8, 6), heading: (0, 1), action: right, reward: 1.04485964124
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 11, 't': 9, 'action': 'right', 'reward': 1.0448596412434477, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.04)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (8, 7), heading: (0, 1), action: forward, reward: 1.49368967235
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 10, 't': 10, 'action': 'forward', 'reward': 1.4936896723499387, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent drove forward instead of right. (rewarded 1.49)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (8, 7), heading: (0, 1), action: None, reward: -4.18615567521
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': 'right'}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', None, 'right'), 'deadline': 9, 't': 11, 'action': None, 'reward': -4.186155675213286, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'right')
Agent idled at a green light with no oncoming traffic. (rewarded -4.19)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: right, reward: 1.58383959178
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', 'left'), 'deadline': 8, 't': 12, 'action': 'right', 'reward': 1.5838395917759018, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', 'left')
Agent followed the waypoint right. (rewarded 1.58)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: None, reward: 2.39994296672
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 7, 't': 13, 'action': None, 'reward': 2.3999429667197996, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.40)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: forward, reward: -10.6089215566
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 6, 't': 14, 'action': 'forward', 'reward': -10.60892155662757, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent attempted driving forward through a red light. (rewarded -10.61)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: None, reward: 1.04402517345
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 5, 't': 15, 'action': None, 'reward': 1.0440251734527135, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.04)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (7, 6), heading: (0, -1), action: right, reward: -0.25647542008
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 4, 't': 16, 'action': 'right', 'reward': -0.2564754200802526, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent drove right instead of left. (rewarded -0.26)
15% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: right, reward: 0.46251907555
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 3, 't': 17, 'action': 'right', 'reward': 0.462519075549642, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 0.46)
10% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: None, reward: -0.345654845647
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'right', None), 'deadline': 2, 't': 18, 'action': None, 'reward': -0.3456548456472992, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', None)
Agent properly idled at a red light. (rewarded -0.35)
5% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (8, 7), heading: (0, 1), action: right, reward: 1.2279895639
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 1, 't': 19, 'action': 'right', 'reward': 1.2279895639032594, 'waypoint': 'right'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.23)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 54
\-------------------------
Environment.reset(): Trial set up with start = (3, 3), destination = (6, 6), deadline = 30
Simulating trial. . .
epsilon = 0.5886; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 3), heading: (-1, 0), action: left, reward: 1.03869861528
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'right'), 'deadline': 30, 't': 0, 'action': 'left', 'reward': 1.0386986152823234, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'right')
Agent drove left instead of right. (rewarded 1.04)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 2), heading: (0, -1), action: right, reward: 0.476911036901
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 29, 't': 1, 'action': 'right', 'reward': 0.4769110369014643, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent drove right instead of forward. (rewarded 0.48)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 2), heading: (0, -1), action: None, reward: 1.64198318254
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'left'), 'deadline': 28, 't': 2, 'action': None, 'reward': 1.641983182538062, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 1.64)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 2), heading: (0, -1), action: None, reward: -4.49491808059
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 27, 't': 3, 'action': None, 'reward': -4.494918080593627, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.49)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 7), heading: (0, -1), action: forward, reward: 0.614174170066
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 26, 't': 4, 'action': 'forward', 'reward': 0.6141741700658493, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent drove forward instead of left. (rewarded 0.61)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 7), heading: (0, -1), action: None, reward: 2.81702530948
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 25, 't': 5, 'action': None, 'reward': 2.817025309483787, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.82)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 7), heading: (0, -1), action: forward, reward: -9.16124247793
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 24, 't': 6, 'action': 'forward', 'reward': -9.161242477930667, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -9.16)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 7), heading: (1, 0), action: right, reward: 0.707154617958
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 23, 't': 7, 'action': 'right', 'reward': 0.7071546179575646, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove right instead of left. (rewarded 0.71)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 6), heading: (0, -1), action: left, reward: 1.04026228815
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 22, 't': 8, 'action': 'left', 'reward': 1.0402622881468107, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'right')
Agent drove left instead of forward. (rewarded 1.04)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (4, 6), heading: (1, 0), action: right, reward: 2.43621108451
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'right', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', 'right'), 'deadline': 21, 't': 9, 'action': 'right', 'reward': 2.4362110845123484, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', 'right')
Agent followed the waypoint right. (rewarded 2.44)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: forward, reward: 2.36595195054
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 20, 't': 10, 'action': 'forward', 'reward': 2.3659519505404596, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.37)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 6), heading: (1, 0), action: forward, reward: 1.22388680192
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 19, 't': 11, 'action': 'forward', 'reward': 1.2238868019152238, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.22)
60% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 55
\-------------------------
Environment.reset(): Trial set up with start = (2, 5), destination = (5, 3), deadline = 25
Simulating trial. . .
epsilon = 0.5827; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: None, reward: -4.85491406071
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 25, 't': 0, 'action': None, 'reward': -4.854914060714166, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent idled at a green light with no oncoming traffic. (rewarded -4.85)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: forward, reward: 2.25466262583
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 24, 't': 1, 'action': 'forward', 'reward': 2.25466262582598, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 2.25)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: None, reward: 1.96976570382
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.9697657038249694, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.97)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: forward, reward: -9.10643838501
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': -9.10643838501434, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent attempted driving forward through a red light. (rewarded -9.11)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: forward, reward: 1.24959158693
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 1.249591586931942, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.25)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: left, reward: -9.51934396866
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 20, 't': 5, 'action': 'left', 'reward': -9.519343968663577, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent attempted driving left through a red light. (rewarded -9.52)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 6), heading: (0, 1), action: right, reward: 0.431706762357
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'right'), 'deadline': 19, 't': 6, 'action': 'right', 'reward': 0.43170676235701977, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'right')
Agent drove right instead of forward. (rewarded 0.43)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (4, 6), heading: (0, 1), action: forward, reward: -10.1988778902
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': -10.198877890198116, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent attempted driving forward through a red light. (rewarded -10.20)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (4, 6), heading: (0, 1), action: None, reward: 1.65877118945
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', 'left'), 'deadline': 17, 't': 8, 'action': None, 'reward': 1.6587711894515376, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'left')
Agent properly idled at a red light. (rewarded 1.66)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (4, 6), heading: (0, 1), action: None, reward: 1.02254797949
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 16, 't': 9, 'action': None, 'reward': 1.0225479794885621, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.02)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (3, 6), heading: (-1, 0), action: right, reward: 1.73745907688
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 15, 't': 10, 'action': 'right', 'reward': 1.7374590768752345, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 1.74)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (3, 6), heading: (-1, 0), action: forward, reward: -10.9693403976
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 14, 't': 11, 'action': 'forward', 'reward': -10.969340397608775, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -10.97)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (3, 6), heading: (-1, 0), action: forward, reward: -39.1284718574
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', 'right', 'forward'), 'deadline': 13, 't': 12, 'action': 'forward', 'reward': -39.128471857360125, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', 'forward')
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -39.13)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (3, 7), heading: (0, 1), action: left, reward: 2.0036397276
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 12, 't': 13, 'action': 'left', 'reward': 2.003639727604794, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 2.00)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (3, 7), heading: (0, 1), action: left, reward: -10.807896087
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 11, 't': 14, 'action': 'left', 'reward': -10.807896087014376, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -10.81)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (3, 7), heading: (0, 1), action: None, reward: -4.51730041816
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 10, 't': 15, 'action': None, 'reward': -4.517300418155316, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.52)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (4, 7), heading: (1, 0), action: left, reward: 2.13565110403
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 9, 't': 16, 'action': 'left', 'reward': 2.135651104033678, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.14)
32% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (5, 7), heading: (1, 0), action: forward, reward: 1.69133311751
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 8, 't': 17, 'action': 'forward', 'reward': 1.6913331175070965, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.69)
28% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (5, 2), heading: (0, 1), action: right, reward: 2.20805832992
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 7, 't': 18, 'action': 'right', 'reward': 2.2080583299161214, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 2.21)
24% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (5, 2), heading: (0, 1), action: None, reward: 2.28854954897
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 6, 't': 19, 'action': None, 'reward': 2.288549548971206, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.29)
20% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (5, 2), heading: (0, 1), action: left, reward: -9.18563874728
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 5, 't': 20, 'action': 'left', 'reward': -9.185638747276947, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -9.19)
16% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (5, 2), heading: (0, 1), action: None, reward: 0.940780593943
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 4, 't': 21, 'action': None, 'reward': 0.940780593943142, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 0.94)
12% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: forward, reward: 0.4657113626
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 3, 't': 22, 'action': 'forward', 'reward': 0.4657113626000766, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 0.47)
8% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 56
\-------------------------
Environment.reset(): Trial set up with start = (5, 7), destination = (7, 5), deadline = 20
Simulating trial. . .
epsilon = 0.5769; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 7), heading: (1, 0), action: left, reward: -39.0103910319
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': 'left'}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 20, 't': 0, 'action': 'left', 'reward': -39.010391031902714, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -39.01)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 7), heading: (1, 0), action: left, reward: -9.95658999538
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 19, 't': 1, 'action': 'left', 'reward': -9.956589995383982, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent attempted driving left through a red light. (rewarded -9.96)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 7), heading: (1, 0), action: None, reward: 2.07216928651
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.0721692865078367, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 2.07)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 7), heading: (1, 0), action: None, reward: 2.21579220905
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 2.215792209049008, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.22)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 7), heading: (1, 0), action: forward, reward: -9.46837412982
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': -9.468374129817242, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent attempted driving forward through a red light. (rewarded -9.47)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (6, 7), heading: (1, 0), action: forward, reward: 1.51725001797
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 1.517250017973989, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.52)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (7, 7), heading: (1, 0), action: forward, reward: 1.11818658219
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 14, 't': 6, 'action': 'forward', 'reward': 1.1181865821889871, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.12)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (7, 2), heading: (0, 1), action: right, reward: 0.274822452514
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 13, 't': 7, 'action': 'right', 'reward': 0.27482245251433435, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent drove right instead of left. (rewarded 0.27)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (6, 2), heading: (-1, 0), action: right, reward: 0.860970924855
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'right'), 'deadline': 12, 't': 8, 'action': 'right', 'reward': 0.8609709248552022, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'right')
Agent followed the waypoint right. (rewarded 0.86)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (6, 2), heading: (-1, 0), action: None, reward: -5.41450909921
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 11, 't': 9, 'action': None, 'reward': -5.414509099208541, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -5.41)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (6, 7), heading: (0, -1), action: right, reward: 0.938387103716
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 10, 't': 10, 'action': 'right', 'reward': 0.9383871037155496, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 0.94)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (7, 7), heading: (1, 0), action: right, reward: 2.09568710973
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 9, 't': 11, 'action': 'right', 'reward': 2.095687109734093, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.10)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: forward, reward: 1.56603739507
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 8, 't': 12, 'action': 'forward', 'reward': 1.5660373950745816, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove forward instead of left. (rewarded 1.57)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: None, reward: -5.84564230027
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 7, 't': 13, 'action': None, 'reward': -5.845642300268909, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.85)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (8, 2), heading: (0, 1), action: right, reward: 0.992506989972
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 6, 't': 14, 'action': 'right', 'reward': 0.9925069899723603, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 0.99)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (8, 3), heading: (0, 1), action: forward, reward: -0.0797845280926
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 5, 't': 15, 'action': 'forward', 'reward': -0.07978452809257819, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent drove forward instead of right. (rewarded -0.08)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (8, 4), heading: (0, 1), action: forward, reward: -0.521100512765
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'right', None), 'deadline': 4, 't': 16, 'action': 'forward', 'reward': -0.5211005127645281, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'right', None)
Agent drove forward instead of right. (rewarded -0.52)
15% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (7, 4), heading: (-1, 0), action: right, reward: 1.65616714939
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 3, 't': 17, 'action': 'right', 'reward': 1.6561671493938754, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.66)
10% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (7, 4), heading: (-1, 0), action: None, reward: 0.453365149335
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 2, 't': 18, 'action': None, 'reward': 0.45336514933542715, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 0.45)
5% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (7, 3), heading: (0, -1), action: right, reward: -0.562753316277
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 1, 't': 19, 'action': 'right', 'reward': -0.5627533162769093, 'waypoint': 'left'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded -0.56)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 57
\-------------------------
Environment.reset(): Trial set up with start = (6, 4), destination = (8, 2), deadline = 20
Simulating trial. . .
epsilon = 0.5712; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 4), heading: (-1, 0), action: right, reward: 1.75789245649
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', 'right'), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.757892456494643, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', 'right')
Agent drove right instead of left. (rewarded 1.76)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (4, 4), heading: (-1, 0), action: forward, reward: 0.147970444113
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 19, 't': 1, 'action': 'forward', 'reward': 0.14797044411269822, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent drove forward instead of right. (rewarded 0.15)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 4), heading: (-1, 0), action: forward, reward: 2.33368825629
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 18, 't': 2, 'action': 'forward', 'reward': 2.333688256289138, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.33)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 4), heading: (-1, 0), action: None, reward: 1.97937904918
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 17, 't': 3, 'action': None, 'reward': 1.9793790491810148, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.98)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 4), heading: (-1, 0), action: None, reward: 2.87669190525
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 16, 't': 4, 'action': None, 'reward': 2.8766919052499196, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.88)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (3, 4), heading: (-1, 0), action: None, reward: -4.34312429227
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 15, 't': 5, 'action': None, 'reward': -4.3431242922693025, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.34)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 4), heading: (-1, 0), action: forward, reward: 1.88106051656
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 14, 't': 6, 'action': 'forward', 'reward': 1.8810605165550787, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.88)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 4), heading: (-1, 0), action: None, reward: 1.44829726986
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 13, 't': 7, 'action': None, 'reward': 1.4482972698641492, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.45)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 4), heading: (-1, 0), action: None, reward: 1.54935159903
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 12, 't': 8, 'action': None, 'reward': 1.5493515990272366, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.55)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 4), heading: (-1, 0), action: left, reward: -9.87039884053
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 11, 't': 9, 'action': 'left', 'reward': -9.87039884052575, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent attempted driving left through a red light. (rewarded -9.87)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: forward, reward: 1.99405688877
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 10, 't': 10, 'action': 'forward', 'reward': 1.9940568887667967, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.99)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (8, 4), heading: (-1, 0), action: forward, reward: 2.09333753543
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 9, 't': 11, 'action': 'forward', 'reward': 2.093337535429354, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.09)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (7, 4), heading: (-1, 0), action: forward, reward: 0.150366362077
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 8, 't': 12, 'action': 'forward', 'reward': 0.15036636207707277, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent drove forward instead of right. (rewarded 0.15)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (7, 5), heading: (0, 1), action: left, reward: 0.611494866478
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 7, 't': 13, 'action': 'left', 'reward': 0.6114948664775205, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent drove left instead of right. (rewarded 0.61)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (8, 5), heading: (1, 0), action: left, reward: 2.31762444924
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 6, 't': 14, 'action': 'left', 'reward': 2.3176244492408484, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.32)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (8, 6), heading: (0, 1), action: right, reward: 2.07058575572
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 5, 't': 15, 'action': 'right', 'reward': 2.0705857557231835, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 2.07)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (8, 7), heading: (0, 1), action: forward, reward: 1.79246971213
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 4, 't': 16, 'action': 'forward', 'reward': 1.7924697121305002, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.79)
15% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (8, 2), heading: (0, 1), action: forward, reward: 0.700770077488
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 3, 't': 17, 'action': 'forward', 'reward': 0.700770077488124, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent followed the waypoint forward. (rewarded 0.70)
10% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 58
\-------------------------
Environment.reset(): Trial set up with start = (1, 4), destination = (4, 5), deadline = 20
Simulating trial. . .
epsilon = 0.5655; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 5), heading: (0, 1), action: forward, reward: 0.140220983482
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 20, 't': 0, 'action': 'forward', 'reward': 0.1402209834816639, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent drove forward instead of left. (rewarded 0.14)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: left, reward: 2.20857184918
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 19, 't': 1, 'action': 'left', 'reward': 2.2085718491810753, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 2.21)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: left, reward: -20.1576218413
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 3, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 18, 't': 2, 'action': 'left', 'reward': -20.157621841257157, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent attempted driving left through traffic and cause a minor accident. (rewarded -20.16)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 4), heading: (0, -1), action: left, reward: 0.541240591292
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 17, 't': 3, 'action': 'left', 'reward': 0.5412405912920343, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove left instead of forward. (rewarded 0.54)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 3), heading: (0, -1), action: forward, reward: 0.451970903262
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 0.45197090326193223, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent drove forward instead of right. (rewarded 0.45)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 3), heading: (0, -1), action: forward, reward: -10.2371063079
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': -10.23710630788604, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent attempted driving forward through a red light. (rewarded -10.24)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: right, reward: 1.86362798189
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 14, 't': 6, 'action': 'right', 'reward': 1.8636279818909025, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent followed the waypoint right. (rewarded 1.86)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: None, reward: 1.67517779515
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 13, 't': 7, 'action': None, 'reward': 1.6751777951505575, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.68)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: None, reward: 1.23586584076
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 12, 't': 8, 'action': None, 'reward': 1.2358658407598009, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.24)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: None, reward: -4.11224959284
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 11, 't': 9, 'action': None, 'reward': -4.112249592840818, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.11)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (4, 3), heading: (1, 0), action: forward, reward: 1.76665614321
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 10, 't': 10, 'action': 'forward', 'reward': 1.7666561432142784, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.77)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (4, 4), heading: (0, 1), action: right, reward: 2.56604903329
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 9, 't': 11, 'action': 'right', 'reward': 2.566049033285658, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.57)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (5, 4), heading: (1, 0), action: left, reward: 0.837961190921
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 8, 't': 12, 'action': 'left', 'reward': 0.8379611909209168, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent drove left instead of forward. (rewarded 0.84)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (5, 4), heading: (1, 0), action: left, reward: -9.93636501089
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 7, 't': 13, 'action': 'left', 'reward': -9.936365010889665, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -9.94)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (5, 4), heading: (1, 0), action: None, reward: 0.957382494246
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 6, 't': 14, 'action': None, 'reward': 0.95738249424619, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.96)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (5, 4), heading: (1, 0), action: left, reward: -10.1397171489
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 5, 't': 15, 'action': 'left', 'reward': -10.13971714891976, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -10.14)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (5, 5), heading: (0, 1), action: right, reward: 0.972716786037
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 4, 't': 16, 'action': 'right', 'reward': 0.9727167860370205, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 0.97)
15% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 5), heading: (-1, 0), action: right, reward: 1.28214658782
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 3, 't': 17, 'action': 'right', 'reward': 1.2821465878231193, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.28)
10% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 59
\-------------------------
Environment.reset(): Trial set up with start = (7, 5), destination = (3, 6), deadline = 25
Simulating trial. . .
epsilon = 0.5599; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (7, 4), heading: (0, -1), action: forward, reward: 1.41307789573
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', 'right'), 'deadline': 25, 't': 0, 'action': 'forward', 'reward': 1.413077895726187, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', 'right')
Agent drove forward instead of right. (rewarded 1.41)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 4), heading: (1, 0), action: right, reward: 1.22320005525
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 24, 't': 1, 'action': 'right', 'reward': 1.2232000552475628, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent followed the waypoint right. (rewarded 1.22)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 4), heading: (1, 0), action: None, reward: 1.78512037101
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.7851203710118186, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.79)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 4), heading: (1, 0), action: None, reward: 2.68287414723
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 22, 't': 3, 'action': None, 'reward': 2.6828741472340782, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.68)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 4), heading: (1, 0), action: forward, reward: 1.89183788519
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 1.891837885188299, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.89)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 4), heading: (1, 0), action: None, reward: 2.66526093722
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 20, 't': 5, 'action': None, 'reward': 2.665260937220537, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.67)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 4), heading: (1, 0), action: left, reward: -9.35979335774
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 19, 't': 6, 'action': 'left', 'reward': -9.359793357742616, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent attempted driving left through a red light. (rewarded -9.36)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 4), heading: (1, 0), action: forward, reward: 1.37629554457
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 1.3762955445718255, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.38)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 4), heading: (1, 0), action: None, reward: 2.04469329301
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 17, 't': 8, 'action': None, 'reward': 2.0446932930092623, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.04)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 4), heading: (1, 0), action: None, reward: 2.42576988045
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'right'), 'deadline': 16, 't': 9, 'action': None, 'reward': 2.425769880448075, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'right')
Agent properly idled at a red light. (rewarded 2.43)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: forward, reward: 1.97285929233
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 15, 't': 10, 'action': 'forward', 'reward': 1.972859292327113, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.97)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (3, 5), heading: (0, 1), action: right, reward: 1.23236772046
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 14, 't': 11, 'action': 'right', 'reward': 1.2323677204636907, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.23)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: right, reward: 1.22319963967
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 13, 't': 12, 'action': 'right', 'reward': 1.2231996396704734, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent drove right instead of forward. (rewarded 1.22)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: None, reward: 2.18950919411
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 12, 't': 13, 'action': None, 'reward': 2.18950919410846, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.19)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: None, reward: 2.12526815957
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'right'), 'deadline': 11, 't': 14, 'action': None, 'reward': 2.1252681595703846, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 2.13)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: forward, reward: -9.81529264467
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 10, 't': 15, 'action': 'forward', 'reward': -9.815292644674079, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent attempted driving forward through a red light. (rewarded -9.82)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: None, reward: -5.59317538235
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 9, 't': 16, 'action': None, 'reward': -5.593175382354776, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.59)
32% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: None, reward: -4.4309575393
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 8, 't': 17, 'action': None, 'reward': -4.430957539303082, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.43)
28% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: forward, reward: 0.610442107674
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', 'left'), 'deadline': 7, 't': 18, 'action': 'forward', 'reward': 0.6104421076743121, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', 'left')
Agent drove forward instead of left. (rewarded 0.61)
24% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (1, 6), heading: (0, 1), action: left, reward: 1.72629677336
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'forward'), 'deadline': 6, 't': 19, 'action': 'left', 'reward': 1.7262967733594523, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'forward')
Agent followed the waypoint left. (rewarded 1.73)
20% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: forward, reward: 0.730099429126
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', 'forward'), 'deadline': 5, 't': 20, 'action': 'forward', 'reward': 0.7300994291262084, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', 'forward')
Agent drove forward instead of left. (rewarded 0.73)
16% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: forward, reward: -9.18698944263
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 4, 't': 21, 'action': 'forward', 'reward': -9.18698944263189, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent attempted driving forward through a red light. (rewarded -9.19)
12% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: None, reward: 2.18385981929
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 3, 't': 22, 'action': None, 'reward': 2.1838598192893603, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.18)
8% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: right, reward: 0.791232010507
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 2, 't': 23, 'action': 'right', 'reward': 0.7912320105074664, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 0.79)
4% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: None, reward: -0.630678003555
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 1, 't': 24, 'action': None, 'reward': -0.6306780035546309, 'waypoint': 'right'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('right', 'red', 'left', None)
Agent properly idled at a red light. (rewarded -0.63)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 60
\-------------------------
Environment.reset(): Trial set up with start = (5, 4), destination = (4, 7), deadline = 20
Simulating trial. . .
epsilon = 0.5543; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 3), heading: (0, -1), action: right, reward: 1.85053968207
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'left'), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.8505396820684956, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'left')
Agent drove right instead of forward. (rewarded 1.85)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (4, 3), heading: (-1, 0), action: left, reward: 2.12760814322
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 19, 't': 1, 'action': 'left', 'reward': 2.127608143223278, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 2.13)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (4, 2), heading: (0, -1), action: right, reward: 2.03004105701
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': 'right', 'reward': 2.030041057007472, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 2.03)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 7), heading: (0, -1), action: forward, reward: 2.69190081383
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 2.69190081382599, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.69)
80% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 61
\-------------------------
Environment.reset(): Trial set up with start = (2, 6), destination = (5, 7), deadline = 20
Simulating trial. . .
epsilon = 0.5488; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: right, reward: 2.18360297389
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 2.1836029738858524, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent followed the waypoint right. (rewarded 2.18)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 7), heading: (0, 1), action: right, reward: 0.135397620269
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 0.13539762026876712, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent drove right instead of forward. (rewarded 0.14)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 7), heading: (-1, 0), action: right, reward: 0.653565851075
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 18, 't': 2, 'action': 'right', 'reward': 0.6535658510753423, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent drove right instead of left. (rewarded 0.65)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 7), heading: (-1, 0), action: None, reward: -4.52032974522
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 17, 't': 3, 'action': None, 'reward': -4.520329745218211, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent idled at a green light with no oncoming traffic. (rewarded -4.52)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 7), heading: (-1, 0), action: forward, reward: -10.3451025646
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': -10.345102564551341, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent attempted driving forward through a red light. (rewarded -10.35)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 6), heading: (0, -1), action: right, reward: 2.67652734153
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 15, 't': 5, 'action': 'right', 'reward': 2.676527341527402, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.68)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: right, reward: 2.57641694011
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 14, 't': 6, 'action': 'right', 'reward': 2.576416940110714, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 2.58)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 7), heading: (0, 1), action: right, reward: 1.28013898635
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', 'right'), 'deadline': 13, 't': 7, 'action': 'right', 'reward': 1.280138986350696, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', 'right')
Agent drove right instead of forward. (rewarded 1.28)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 7), heading: (0, 1), action: None, reward: 2.560797245
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 12, 't': 8, 'action': None, 'reward': 2.5607972450024166, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.56)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (4, 7), heading: (1, 0), action: left, reward: 2.00239037074
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'forward'), 'deadline': 11, 't': 9, 'action': 'left', 'reward': 2.002390370741371, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'forward')
Agent followed the waypoint left. (rewarded 2.00)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 7), heading: (1, 0), action: forward, reward: 2.1588242685
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 10, 't': 10, 'action': 'forward', 'reward': 2.1588242685038357, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.16)
45% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 62
\-------------------------
Environment.reset(): Trial set up with start = (4, 4), destination = (1, 3), deadline = 20
Simulating trial. . .
epsilon = 0.5434; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 4), heading: (-1, 0), action: right, reward: 1.42913216986
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.4291321698573032, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.43)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 4), heading: (-1, 0), action: left, reward: -10.5636268884
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'right'}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'forward', 'right'), 'deadline': 19, 't': 1, 'action': 'left', 'reward': -10.563626888408226, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'right')
Agent attempted driving left through a red light. (rewarded -10.56)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 4), heading: (-1, 0), action: None, reward: 2.31120751043
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'left'), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.3112075104348118, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'left')
Agent properly idled at a red light. (rewarded 2.31)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 4), heading: (-1, 0), action: None, reward: 2.563546535
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'left'), 'deadline': 17, 't': 3, 'action': None, 'reward': 2.563546534998037, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'left')
Agent properly idled at a red light. (rewarded 2.56)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 4), heading: (-1, 0), action: None, reward: 2.56276683101
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 16, 't': 4, 'action': None, 'reward': 2.562766831011352, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.56)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 4), heading: (-1, 0), action: forward, reward: 1.21625365215
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 1.216253652150612, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.22)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: forward, reward: 2.64193872364
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 14, 't': 6, 'action': 'forward', 'reward': 2.6419387236423573, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.64)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 5), heading: (0, 1), action: left, reward: 1.77184201785
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 13, 't': 7, 'action': 'left', 'reward': 1.7718420178505794, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent drove left instead of right. (rewarded 1.77)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (1, 5), heading: (0, 1), action: forward, reward: -10.572924852
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 12, 't': 8, 'action': 'forward', 'reward': -10.572924851959812, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent attempted driving forward through a red light. (rewarded -10.57)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (1, 5), heading: (0, 1), action: None, reward: -5.00903550974
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 11, 't': 9, 'action': None, 'reward': -5.009035509735884, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.01)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (1, 5), heading: (0, 1), action: None, reward: -5.11348309774
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 10, 't': 10, 'action': None, 'reward': -5.1134830977426144, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.11)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (1, 5), heading: (0, 1), action: right, reward: -20.854490193
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 3, 'light': 'red', 'state': ('right', 'red', 'left', 'forward'), 'deadline': 9, 't': 11, 'action': 'right', 'reward': -20.854490193005965, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', 'forward')
Agent attempted driving right through traffic and cause a minor accident. (rewarded -20.85)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: right, reward: 1.1149686219
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 8, 't': 12, 'action': 'right', 'reward': 1.114968621895011, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 1.11)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (8, 4), heading: (0, -1), action: right, reward: 1.51246085422
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 7, 't': 13, 'action': 'right', 'reward': 1.5124608542245463, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent followed the waypoint right. (rewarded 1.51)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (8, 3), heading: (0, -1), action: forward, reward: -0.420174253446
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', 'forward'), 'deadline': 6, 't': 14, 'action': 'forward', 'reward': -0.4201742534459415, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', 'forward')
Agent drove forward instead of right. (rewarded -0.42)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (8, 3), heading: (0, -1), action: None, reward: -4.53502483267
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': 'right'}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', 'forward', 'right'), 'deadline': 5, 't': 15, 'action': None, 'reward': -4.535024832672095, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', 'right')
Agent idled at a green light with no oncoming traffic. (rewarded -4.54)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: right, reward: 2.29211607005
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 4, 't': 16, 'action': 'right', 'reward': 2.292116070052641, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent followed the waypoint right. (rewarded 2.29)
15% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 63
\-------------------------
Environment.reset(): Trial set up with start = (5, 2), destination = (3, 4), deadline = 20
Simulating trial. . .
epsilon = 0.5379; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: right, reward: 2.0023540962
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'right'), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 2.0023540961977204, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'right')
Agent followed the waypoint right. (rewarded 2.00)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: left, reward: -10.3805979693
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 19, 't': 1, 'action': 'left', 'reward': -10.380597969279496, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent attempted driving left through a red light. (rewarded -10.38)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (4, 3), heading: (-1, 0), action: right, reward: 1.46606933309
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': 'right', 'reward': 1.4660693330857595, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 1.47)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (4, 4), heading: (0, 1), action: left, reward: 1.62177470006
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 17, 't': 3, 'action': 'left', 'reward': 1.6217747000583906, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent drove left instead of forward. (rewarded 1.62)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 4), heading: (1, 0), action: left, reward: 1.62101958311
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 16, 't': 4, 'action': 'left', 'reward': 1.621019583113441, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent drove left instead of right. (rewarded 1.62)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 5), heading: (0, 1), action: right, reward: 1.16382098692
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 15, 't': 5, 'action': 'right', 'reward': 1.163820986919013, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 1.16)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (6, 5), heading: (1, 0), action: left, reward: 1.04248819748
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 14, 't': 6, 'action': 'left', 'reward': 1.042488197476513, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent drove left instead of right. (rewarded 1.04)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 5), heading: (1, 0), action: None, reward: 0.920791445377
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 13, 't': 7, 'action': None, 'reward': 0.92079144537736, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 0.92)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (6, 6), heading: (0, 1), action: right, reward: 0.284028610141
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 12, 't': 8, 'action': 'right', 'reward': 0.28402861014063674, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent drove right instead of left. (rewarded 0.28)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (5, 6), heading: (-1, 0), action: right, reward: 1.82777386878
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'right', None), 'deadline': 11, 't': 9, 'action': 'right', 'reward': 1.8277738687841616, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', None)
Agent followed the waypoint right. (rewarded 1.83)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (5, 6), heading: (-1, 0), action: forward, reward: -9.32585873831
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 10, 't': 10, 'action': 'forward', 'reward': -9.325858738314743, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -9.33)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (4, 6), heading: (-1, 0), action: forward, reward: 2.11431740605
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 9, 't': 11, 'action': 'forward', 'reward': 2.1143174060509686, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.11)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (4, 5), heading: (0, -1), action: right, reward: 0.55111207854
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 8, 't': 12, 'action': 'right', 'reward': 0.551112078539824, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent drove right instead of forward. (rewarded 0.55)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (4, 5), heading: (0, -1), action: forward, reward: -9.80836205211
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 7, 't': 13, 'action': 'forward', 'reward': -9.80836205211007, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -9.81)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (4, 5), heading: (0, -1), action: None, reward: 2.43607998908
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 6, 't': 14, 'action': None, 'reward': 2.4360799890766707, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.44)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (4, 5), heading: (0, -1), action: forward, reward: -10.3752094943
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 5, 't': 15, 'action': 'forward', 'reward': -10.375209494306906, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -10.38)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (4, 5), heading: (0, -1), action: None, reward: 2.11136710706
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 4, 't': 16, 'action': None, 'reward': 2.1113671070552607, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.11)
15% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (3, 5), heading: (-1, 0), action: left, reward: 1.11504752493
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'right'), 'deadline': 3, 't': 17, 'action': 'left', 'reward': 1.1150475249341905, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'right')
Agent followed the waypoint left. (rewarded 1.12)
10% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 4), heading: (0, -1), action: right, reward: 0.631825567656
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 2, 't': 18, 'action': 'right', 'reward': 0.6318255676564137, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 0.63)
5% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 64
\-------------------------
Environment.reset(): Trial set up with start = (1, 3), destination = (4, 6), deadline = 30
Simulating trial. . .
epsilon = 0.5326; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: right, reward: 1.91829927065
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 30, 't': 0, 'action': 'right', 'reward': 1.9182992706547155, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent followed the waypoint right. (rewarded 1.92)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 2.29293861331
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 29, 't': 1, 'action': None, 'reward': 2.292938613308369, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.29)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 1.85533687949
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 28, 't': 2, 'action': None, 'reward': 1.855336879494347, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.86)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 1.76229709558
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 27, 't': 3, 'action': None, 'reward': 1.762297095577753, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.76)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 1.33603196857
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 26, 't': 4, 'action': None, 'reward': 1.3360319685675666, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.34)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 1.1593048455
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 25, 't': 5, 'action': None, 'reward': 1.159304845499264, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.16)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: forward, reward: 2.8855636512
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 24, 't': 6, 'action': 'forward', 'reward': 2.8855636511966942, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.89)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (4, 3), heading: (1, 0), action: forward, reward: 2.50305659246
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 23, 't': 7, 'action': 'forward', 'reward': 2.5030565924569843, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.50)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (4, 4), heading: (0, 1), action: right, reward: 0.641887029152
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', None), 'deadline': 22, 't': 8, 'action': 'right', 'reward': 0.6418870291515409, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', None)
Agent drove right instead of left. (rewarded 0.64)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (4, 5), heading: (0, 1), action: forward, reward: 1.68649550724
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 21, 't': 9, 'action': 'forward', 'reward': 1.6864955072414631, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.69)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 6), heading: (0, 1), action: forward, reward: 2.25195354946
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 20, 't': 10, 'action': 'forward', 'reward': 2.2519535494563754, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.25)
63% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 65
\-------------------------
Environment.reset(): Trial set up with start = (5, 2), destination = (8, 5), deadline = 30
Simulating trial. . .
epsilon = 0.5273; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 7), heading: (0, -1), action: right, reward: 1.43802491501
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 30, 't': 0, 'action': 'right', 'reward': 1.4380249150089421, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 1.44)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 6), heading: (0, -1), action: forward, reward: 1.77809081114
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 29, 't': 1, 'action': 'forward', 'reward': 1.778090811136312, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent drove forward instead of right. (rewarded 1.78)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 5), heading: (0, -1), action: forward, reward: 0.804343564064
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 28, 't': 2, 'action': 'forward', 'reward': 0.8043435640639957, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent drove forward instead of right. (rewarded 0.80)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (6, 5), heading: (1, 0), action: right, reward: 2.5287018598
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 27, 't': 3, 'action': 'right', 'reward': 2.5287018597992312, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 2.53)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (6, 5), heading: (1, 0), action: forward, reward: -9.84914901036
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 26, 't': 4, 'action': 'forward', 'reward': -9.849149010363988, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent attempted driving forward through a red light. (rewarded -9.85)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (6, 4), heading: (0, -1), action: left, reward: 1.55747758649
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 25, 't': 5, 'action': 'left', 'reward': 1.5574775864911938, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent drove left instead of forward. (rewarded 1.56)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (7, 4), heading: (1, 0), action: right, reward: 2.75586006754
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', 'left'), 'deadline': 24, 't': 6, 'action': 'right', 'reward': 2.7558600675425664, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', 'left')
Agent followed the waypoint right. (rewarded 2.76)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (7, 5), heading: (0, 1), action: right, reward: 0.279196360501
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 23, 't': 7, 'action': 'right', 'reward': 0.27919636050050045, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove right instead of forward. (rewarded 0.28)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (7, 5), heading: (0, 1), action: None, reward: 2.25994754563
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 22, 't': 8, 'action': None, 'reward': 2.2599475456278437, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.26)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (6, 5), heading: (-1, 0), action: right, reward: 0.139475331363
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 21, 't': 9, 'action': 'right', 'reward': 0.1394753313630237, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent drove right instead of left. (rewarded 0.14)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (5, 5), heading: (-1, 0), action: forward, reward: 1.80609433334
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 20, 't': 10, 'action': 'forward', 'reward': 1.8060943333434354, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent drove forward instead of right. (rewarded 1.81)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (4, 5), heading: (-1, 0), action: forward, reward: -0.0339633862179
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 19, 't': 11, 'action': 'forward', 'reward': -0.033963386217909064, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent drove forward instead of right. (rewarded -0.03)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (4, 5), heading: (-1, 0), action: None, reward: 1.27927063298
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 18, 't': 12, 'action': None, 'reward': 1.2792706329762793, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.28)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (3, 5), heading: (-1, 0), action: forward, reward: 0.909959749539
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 17, 't': 13, 'action': 'forward', 'reward': 0.9099597495390024, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 0.91)
53% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: forward, reward: 1.49119888225
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 16, 't': 14, 'action': 'forward', 'reward': 1.491198882245801, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.49)
50% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: forward, reward: -9.62559768406
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 15, 't': 15, 'action': 'forward', 'reward': -9.625597684058386, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent attempted driving forward through a red light. (rewarded -9.63)
47% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: forward, reward: 1.13682460308
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 14, 't': 16, 'action': 'forward', 'reward': 1.1368246030818527, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.14)
43% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (1, 4), heading: (0, -1), action: right, reward: 0.403876985392
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 13, 't': 17, 'action': 'right', 'reward': 0.4038769853921882, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent drove right instead of forward. (rewarded 0.40)
40% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (1, 3), heading: (0, -1), action: forward, reward: 0.871467060415
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 12, 't': 18, 'action': 'forward', 'reward': 0.8714670604148609, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent drove forward instead of left. (rewarded 0.87)
37% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (1, 3), heading: (0, -1), action: left, reward: -19.766210759
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'forward', 'left': None}, 'violation': 3, 'light': 'green', 'state': ('left', 'green', 'right', None), 'deadline': 11, 't': 19, 'action': 'left', 'reward': -19.76621075895844, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', None)
Agent attempted driving left through traffic and cause a minor accident. (rewarded -19.77)
33% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: right, reward: 1.57692863959
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'left'), 'deadline': 10, 't': 20, 'action': 'right', 'reward': 1.5769286395907325, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'left')
Agent drove right instead of left. (rewarded 1.58)
30% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (2, 4), heading: (0, 1), action: right, reward: 1.23982133677
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 9, 't': 21, 'action': 'right', 'reward': 1.239821336770646, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 1.24)
27% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (2, 4), heading: (0, 1), action: None, reward: -0.32150092287
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 8, 't': 22, 'action': None, 'reward': -0.32150092286969323, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent properly idled at a red light. (rewarded -0.32)
23% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: right, reward: 1.3340672528
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'right', 'left'), 'deadline': 7, 't': 23, 'action': 'right', 'reward': 1.3340672527998323, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', 'left')
Agent followed the waypoint right. (rewarded 1.33)
20% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (8, 4), heading: (-1, 0), action: forward, reward: 1.49305911016
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 6, 't': 24, 'action': 'forward', 'reward': 1.49305911015873, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.49)
17% of time remaining to reach destination.
/-------------------
| Step 25 Results
\-------------------
Environment.step(): t = 25
Environment.act() [POST]: location: (8, 3), heading: (0, -1), action: right, reward: 0.347027329858
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 5, 't': 25, 'action': 'right', 'reward': 0.3470273298578621, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent drove right instead of left. (rewarded 0.35)
13% of time remaining to reach destination.
/-------------------
| Step 26 Results
\-------------------
Environment.step(): t = 26
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: right, reward: 1.07463139391
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 4, 't': 26, 'action': 'right', 'reward': 1.0746313939143632, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 1.07)
10% of time remaining to reach destination.
/-------------------
| Step 27 Results
\-------------------
Environment.step(): t = 27
Environment.act() [POST]: location: (1, 4), heading: (0, 1), action: right, reward: 1.60825345082
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 3, 't': 27, 'action': 'right', 'reward': 1.6082534508244124, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.61)
7% of time remaining to reach destination.
/-------------------
| Step 28 Results
\-------------------
Environment.step(): t = 28
Environment.act() [POST]: location: (8, 4), heading: (-1, 0), action: right, reward: 2.04115229777
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 2, 't': 28, 'action': 'right', 'reward': 2.041152297766753, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 2.04)
3% of time remaining to reach destination.
/-------------------
| Step 29 Results
\-------------------
Environment.step(): t = 29
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (8, 5), heading: (0, 1), action: left, reward: 0.646465219652
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 1, 't': 29, 'action': 'left', 'reward': 0.6464652196519984, 'waypoint': 'left'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 0.65)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 66
\-------------------------
Environment.reset(): Trial set up with start = (5, 2), destination = (3, 6), deadline = 20
Simulating trial. . .
epsilon = 0.5220; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (4, 2), heading: (-1, 0), action: forward, reward: 2.38510315201
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 20, 't': 0, 'action': 'forward', 'reward': 2.3851031520073804, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.39)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (4, 2), heading: (-1, 0), action: None, reward: -5.60190677632
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 19, 't': 1, 'action': None, 'reward': -5.601906776318939, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.60)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 2), heading: (-1, 0), action: forward, reward: 1.90143238877
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 18, 't': 2, 'action': 'forward', 'reward': 1.901432388770212, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.90)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 7), heading: (0, -1), action: right, reward: 1.58766841268
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 17, 't': 3, 'action': 'right', 'reward': 1.5876684126792522, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.59)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 6), heading: (0, -1), action: forward, reward: 2.5044633232
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 2.5044633232042592, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.50)
75% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 67
\-------------------------
Environment.reset(): Trial set up with start = (7, 5), destination = (3, 2), deadline = 35
Simulating trial. . .
epsilon = 0.5169; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 5), heading: (1, 0), action: right, reward: 1.67303357041
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 35, 't': 0, 'action': 'right', 'reward': 1.6730335704127057, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent followed the waypoint right. (rewarded 1.67)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 5), heading: (1, 0), action: None, reward: 2.94224773209
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 34, 't': 1, 'action': None, 'reward': 2.9422477320883744, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.94)
94% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 5), heading: (1, 0), action: None, reward: 2.59906961387
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 33, 't': 2, 'action': None, 'reward': 2.5990696138730636, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.60)
91% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 5), heading: (1, 0), action: None, reward: 1.57342035111
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 32, 't': 3, 'action': None, 'reward': 1.5734203511111426, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.57)
89% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 5), heading: (1, 0), action: None, reward: 2.86997410873
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 31, 't': 4, 'action': None, 'reward': 2.8699741087288455, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.87)
86% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 5), heading: (1, 0), action: forward, reward: 1.83955549816
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 30, 't': 5, 'action': 'forward', 'reward': 1.8395554981631526, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.84)
83% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 5), heading: (1, 0), action: None, reward: 2.29686765204
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 29, 't': 6, 'action': None, 'reward': 2.296867652044676, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.30)
80% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 4), heading: (0, -1), action: left, reward: -0.0218391566344
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 28, 't': 7, 'action': 'left', 'reward': -0.021839156634373746, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent drove left instead of forward. (rewarded -0.02)
77% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 4), heading: (1, 0), action: right, reward: 2.91806907947
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 27, 't': 8, 'action': 'right', 'reward': 2.918069079467553, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.92)
74% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 4), heading: (1, 0), action: left, reward: -10.3135944689
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 26, 't': 9, 'action': 'left', 'reward': -10.313594468865245, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent attempted driving left through a red light. (rewarded -10.31)
71% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: forward, reward: 2.72494548287
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 25, 't': 10, 'action': 'forward', 'reward': 2.724945482874346, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.72)
69% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: forward, reward: -10.224817866
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 24, 't': 11, 'action': 'forward', 'reward': -10.224817866007564, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -10.22)
66% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (3, 5), heading: (0, 1), action: right, reward: 0.497744798119
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'left'), 'deadline': 23, 't': 12, 'action': 'right', 'reward': 0.49774479811853223, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'left')
Agent drove right instead of left. (rewarded 0.50)
63% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (3, 5), heading: (0, 1), action: forward, reward: -40.1824770284
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'forward', 'left': None}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 22, 't': 13, 'action': 'forward', 'reward': -40.18247702842993, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -40.18)
60% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (3, 5), heading: (0, 1), action: None, reward: 2.30691651988
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 21, 't': 14, 'action': None, 'reward': 2.306916519877383, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 2.31)
57% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (3, 5), heading: (0, 1), action: left, reward: -10.0936110207
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 20, 't': 15, 'action': 'left', 'reward': -10.093611020666682, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent attempted driving left through a red light. (rewarded -10.09)
54% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: left, reward: 0.43439661286
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 19, 't': 16, 'action': 'left', 'reward': 0.4343966128602216, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove left instead of forward. (rewarded 0.43)
51% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: forward, reward: -40.6639457545
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('right', 'red', None, 'forward'), 'deadline': 18, 't': 17, 'action': 'forward', 'reward': -40.66394575449991, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'forward')
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -40.66)
49% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (4, 6), heading: (0, 1), action: right, reward: 0.810993654649
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 17, 't': 18, 'action': 'right', 'reward': 0.810993654649276, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 0.81)
46% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (4, 7), heading: (0, 1), action: forward, reward: 0.904745867291
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'right'), 'deadline': 16, 't': 19, 'action': 'forward', 'reward': 0.9047458672906091, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'right')
Agent drove forward instead of right. (rewarded 0.90)
43% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (3, 7), heading: (-1, 0), action: right, reward: 2.68706993635
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'right', None), 'deadline': 15, 't': 20, 'action': 'right', 'reward': 2.687069936354768, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', None)
Agent followed the waypoint right. (rewarded 2.69)
40% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (3, 6), heading: (0, -1), action: right, reward: -0.310497194503
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 14, 't': 21, 'action': 'right', 'reward': -0.31049719450252145, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded -0.31)
37% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (4, 6), heading: (1, 0), action: right, reward: 2.58286147369
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 13, 't': 22, 'action': 'right', 'reward': 2.582861473687263, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.58)
34% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (4, 7), heading: (0, 1), action: right, reward: 1.77405116272
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 12, 't': 23, 'action': 'right', 'reward': 1.7740511627151214, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.77)
31% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (4, 7), heading: (0, 1), action: None, reward: -5.43167302511
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 11, 't': 24, 'action': None, 'reward': -5.431673025110116, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent idled at a green light with no oncoming traffic. (rewarded -5.43)
29% of time remaining to reach destination.
/-------------------
| Step 25 Results
\-------------------
Environment.step(): t = 25
Environment.act() [POST]: location: (3, 7), heading: (-1, 0), action: right, reward: 1.29919209657
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', 'left'), 'deadline': 10, 't': 25, 'action': 'right', 'reward': 1.299192096567495, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', 'left')
Agent followed the waypoint right. (rewarded 1.30)
26% of time remaining to reach destination.
/-------------------
| Step 26 Results
\-------------------
Environment.step(): t = 26
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 2), heading: (0, 1), action: left, reward: 1.80182614393
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 9, 't': 26, 'action': 'left', 'reward': 1.801826143932178, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.80)
23% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 68
\-------------------------
Environment.reset(): Trial set up with start = (6, 7), destination = (2, 5), deadline = 30
Simulating trial. . .
epsilon = 0.5117; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 6), heading: (0, -1), action: right, reward: 1.01814422436
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 30, 't': 0, 'action': 'right', 'reward': 1.0181442243595424, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.02)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 6), heading: (0, -1), action: None, reward: -5.59068504163
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 29, 't': 1, 'action': None, 'reward': -5.590685041633943, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.59)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 6), heading: (1, 0), action: right, reward: 1.77579027437
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 28, 't': 2, 'action': 'right', 'reward': 1.7757902743683782, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.78)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 6), heading: (1, 0), action: None, reward: 2.56711427318
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 27, 't': 3, 'action': None, 'reward': 2.5671142731758767, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.57)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 6), heading: (1, 0), action: left, reward: -9.71821820621
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 26, 't': 4, 'action': 'left', 'reward': -9.718218206211835, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -9.72)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 6), heading: (1, 0), action: None, reward: 1.50156732155
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 25, 't': 5, 'action': None, 'reward': 1.5015673215534127, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.50)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (7, 6), heading: (1, 0), action: left, reward: -19.1031097069
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 3, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 24, 't': 6, 'action': 'left', 'reward': -19.103109706878858, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent attempted driving left through traffic and cause a minor accident. (rewarded -19.10)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (7, 6), heading: (1, 0), action: left, reward: -19.2685483121
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'left', 'left': 'forward'}, 'violation': 3, 'light': 'green', 'state': ('forward', 'green', 'right', 'forward'), 'deadline': 23, 't': 7, 'action': 'left', 'reward': -19.26854831213712, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', 'forward')
Agent attempted driving left through traffic and cause a minor accident. (rewarded -19.27)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: forward, reward: 2.68131979366
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 22, 't': 8, 'action': 'forward', 'reward': 2.681319793658239, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.68)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (8, 5), heading: (0, -1), action: left, reward: 1.82551534402
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 21, 't': 9, 'action': 'left', 'reward': 1.82551534402347, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove left instead of forward. (rewarded 1.83)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (1, 5), heading: (1, 0), action: right, reward: 2.3447323281
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 20, 't': 10, 'action': 'right', 'reward': 2.3447323280970713, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.34)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (1, 5), heading: (1, 0), action: None, reward: 2.07344485456
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 19, 't': 11, 'action': None, 'reward': 2.0734448545585344, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.07)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (1, 5), heading: (1, 0), action: None, reward: 1.02932440315
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'right'), 'deadline': 18, 't': 12, 'action': None, 'reward': 1.0293244031512288, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 1.03)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: forward, reward: 2.6479582839
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 17, 't': 13, 'action': 'forward', 'reward': 2.647958283897733, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.65)
53% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 69
\-------------------------
Environment.reset(): Trial set up with start = (6, 7), destination = (1, 6), deadline = 20
Simulating trial. . .
epsilon = 0.5066; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 6), heading: (0, -1), action: right, reward: 1.58375938884
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'right'), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.5837593888414745, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'right')
Agent followed the waypoint right. (rewarded 1.58)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 6), heading: (-1, 0), action: left, reward: 1.06181931075
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 19, 't': 1, 'action': 'left', 'reward': 1.0618193107514666, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent drove left instead of right. (rewarded 1.06)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 5), heading: (0, -1), action: right, reward: 2.02159112963
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': 'right', 'reward': 2.0215911296318025, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 2.02)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (6, 5), heading: (1, 0), action: right, reward: 1.43268936411
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 17, 't': 3, 'action': 'right', 'reward': 1.4326893641065037, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.43)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 5), heading: (1, 0), action: forward, reward: 1.23454735328
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 1.2345473532810598, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.23)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 5), heading: (1, 0), action: forward, reward: -10.5965790432
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': -10.59657904317147, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent attempted driving forward through a red light. (rewarded -10.60)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (8, 5), heading: (1, 0), action: forward, reward: 1.1133366058
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 14, 't': 6, 'action': 'forward', 'reward': 1.1133366058022178, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.11)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (8, 5), heading: (1, 0), action: None, reward: 1.2004268399
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 13, 't': 7, 'action': None, 'reward': 1.20042683990362, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.20)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (8, 5), heading: (1, 0), action: None, reward: 2.40122611425
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'forward'), 'deadline': 12, 't': 8, 'action': None, 'reward': 2.4012261142544244, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 2.40)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (1, 5), heading: (1, 0), action: forward, reward: 2.26405902496
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 11, 't': 9, 'action': 'forward', 'reward': 2.2640590249620396, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent followed the waypoint forward. (rewarded 2.26)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (1, 4), heading: (0, -1), action: left, reward: 0.0217994218185
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 10, 't': 10, 'action': 'left', 'reward': 0.021799421818517906, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent drove left instead of right. (rewarded 0.02)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (2, 4), heading: (1, 0), action: right, reward: 2.05596082915
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 9, 't': 11, 'action': 'right', 'reward': 2.055960829149983, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent followed the waypoint right. (rewarded 2.06)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: forward, reward: 1.01999024213
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'right', 'forward'), 'deadline': 8, 't': 12, 'action': 'forward', 'reward': 1.019990242132495, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'right', 'forward')
Agent drove forward instead of right. (rewarded 1.02)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: None, reward: -0.229098735724
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', 'forward'), 'deadline': 7, 't': 13, 'action': None, 'reward': -0.22909873572352446, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded -0.23)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: left, reward: -9.82711327975
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 6, 't': 14, 'action': 'left', 'reward': -9.827113279747868, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent attempted driving left through a red light. (rewarded -9.83)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (3, 5), heading: (0, 1), action: right, reward: 1.83882012495
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 5, 't': 15, 'action': 'right', 'reward': 1.8388201249485847, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent followed the waypoint right. (rewarded 1.84)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: right, reward: 1.1819832111
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 4, 't': 16, 'action': 'right', 'reward': 1.1819832111035788, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.18)
15% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: None, reward: -4.30629313301
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 3, 't': 17, 'action': None, 'reward': -4.306293133006912, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.31)
10% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: forward, reward: 0.861096705341
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 2, 't': 18, 'action': 'forward', 'reward': 0.8610967053412721, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 0.86)
5% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: None, reward: 0.99863239923
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 1, 't': 19, 'action': None, 'reward': 0.9986323992296628, 'waypoint': 'left'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.00)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 70
\-------------------------
Environment.reset(): Trial set up with start = (5, 4), destination = (7, 7), deadline = 25
Simulating trial. . .
epsilon = 0.5016; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 3), heading: (0, -1), action: right, reward: 1.62491115644
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 1.6249111564382184, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 1.62)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 3), heading: (0, -1), action: left, reward: -9.21322997221
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 24, 't': 1, 'action': 'left', 'reward': -9.213229972213256, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent attempted driving left through a red light. (rewarded -9.21)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 3), heading: (1, 0), action: right, reward: 1.99193215741
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'right', None), 'deadline': 23, 't': 2, 'action': 'right', 'reward': 1.99193215741127, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', None)
Agent followed the waypoint right. (rewarded 1.99)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (6, 4), heading: (0, 1), action: right, reward: 1.55380464911
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 22, 't': 3, 'action': 'right', 'reward': 1.5538046491084072, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent drove right instead of forward. (rewarded 1.55)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (6, 4), heading: (0, 1), action: left, reward: -9.98411143598
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 21, 't': 4, 'action': 'left', 'reward': -9.984111435980784, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -9.98)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 4), heading: (-1, 0), action: right, reward: 0.546143993358
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 20, 't': 5, 'action': 'right', 'reward': 0.5461439933577198, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove right instead of left. (rewarded 0.55)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (5, 3), heading: (0, -1), action: right, reward: 1.69788308399
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 19, 't': 6, 'action': 'right', 'reward': 1.6978830839893961, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent followed the waypoint right. (rewarded 1.70)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 3), heading: (1, 0), action: right, reward: 1.62790649958
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 18, 't': 7, 'action': 'right', 'reward': 1.6279064995795196, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.63)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (6, 3), heading: (1, 0), action: left, reward: -40.8611938315
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 17, 't': 8, 'action': 'left', 'reward': -40.86119383147248, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -40.86)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (6, 2), heading: (0, -1), action: left, reward: 0.716192661753
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 16, 't': 9, 'action': 'left', 'reward': 0.7161926617525498, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove left instead of forward. (rewarded 0.72)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (7, 2), heading: (1, 0), action: right, reward: 2.19587340148
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 15, 't': 10, 'action': 'right', 'reward': 2.1958734014774155, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 2.20)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (7, 3), heading: (0, 1), action: right, reward: 0.127177930337
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 14, 't': 11, 'action': 'right', 'reward': 0.12717793033728453, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove right instead of left. (rewarded 0.13)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (6, 3), heading: (-1, 0), action: right, reward: 2.00639160502
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 13, 't': 12, 'action': 'right', 'reward': 2.0063916050174906, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.01)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (5, 3), heading: (-1, 0), action: forward, reward: -0.0288577923796
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 12, 't': 13, 'action': 'forward', 'reward': -0.02885779237959829, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent drove forward instead of right. (rewarded -0.03)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (5, 3), heading: (-1, 0), action: None, reward: -5.76169191763
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 11, 't': 14, 'action': None, 'reward': -5.761691917627906, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent idled at a green light with no oncoming traffic. (rewarded -5.76)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (5, 2), heading: (0, -1), action: right, reward: 1.68787913472
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 10, 't': 15, 'action': 'right', 'reward': 1.6878791347242668, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent followed the waypoint right. (rewarded 1.69)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (6, 2), heading: (1, 0), action: right, reward: 1.51142749795
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'right'), 'deadline': 9, 't': 16, 'action': 'right', 'reward': 1.5114274979548068, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'right')
Agent followed the waypoint right. (rewarded 1.51)
32% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (6, 2), heading: (1, 0), action: None, reward: 0.614121744475
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 8, 't': 17, 'action': None, 'reward': 0.6141217444748563, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.61)
28% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (6, 3), heading: (0, 1), action: right, reward: 1.52655251605
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 7, 't': 18, 'action': 'right', 'reward': 1.5265525160499411, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent drove right instead of forward. (rewarded 1.53)
24% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (7, 3), heading: (1, 0), action: left, reward: 2.33663154077
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 6, 't': 19, 'action': 'left', 'reward': 2.336631540765387, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.34)
20% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (7, 2), heading: (0, -1), action: left, reward: 0.724812540823
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 5, 't': 20, 'action': 'left', 'reward': 0.7248125408226811, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 0.72)
16% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (7, 7), heading: (0, -1), action: forward, reward: 1.09533684814
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 4, 't': 21, 'action': 'forward', 'reward': 1.0953368481359114, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.10)
12% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 71
\-------------------------
Environment.reset(): Trial set up with start = (2, 4), destination = (5, 3), deadline = 20
Simulating trial. . .
epsilon = 0.4966; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 3), heading: (0, -1), action: left, reward: 1.45307828209
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'right'), 'deadline': 20, 't': 0, 'action': 'left', 'reward': 1.453078282091662, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'right')
Agent drove left instead of forward. (rewarded 1.45)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: right, reward: 2.2480867481
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 2.248086748095999, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 2.25)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 4), heading: (0, 1), action: right, reward: 1.28213377402
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', 'left'), 'deadline': 18, 't': 2, 'action': 'right', 'reward': 1.2821337740237766, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', 'left')
Agent drove right instead of forward. (rewarded 1.28)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (4, 4), heading: (1, 0), action: left, reward: 2.70286831414
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 17, 't': 3, 'action': 'left', 'reward': 2.7028683141417558, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 2.70)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (4, 4), heading: (1, 0), action: left, reward: -10.9097346868
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 16, 't': 4, 'action': 'left', 'reward': -10.909734686824388, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -10.91)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (4, 4), heading: (1, 0), action: left, reward: -10.3702833453
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 15, 't': 5, 'action': 'left', 'reward': -10.370283345305108, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -10.37)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 3), heading: (0, -1), action: left, reward: 0.255093712381
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 14, 't': 6, 'action': 'left', 'reward': 0.25509371238111767, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent drove left instead of forward. (rewarded 0.26)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 3), heading: (1, 0), action: right, reward: 2.0223994349
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 13, 't': 7, 'action': 'right', 'reward': 2.0223994348989773, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 2.02)
60% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 72
\-------------------------
Environment.reset(): Trial set up with start = (1, 4), destination = (4, 5), deadline = 20
Simulating trial. . .
epsilon = 0.4916; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 4), heading: (0, -1), action: left, reward: -39.3874359776
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('right', 'red', 'right', 'forward'), 'deadline': 20, 't': 0, 'action': 'left', 'reward': -39.38743597755633, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', 'forward')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -39.39)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 4), heading: (0, -1), action: None, reward: 0.75571117012
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'right'), 'deadline': 19, 't': 1, 'action': None, 'reward': 0.7557111701204727, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 0.76)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 4), heading: (0, -1), action: None, reward: 1.31982803401
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.3198280340110298, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.32)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 4), heading: (-1, 0), action: left, reward: 0.641838623526
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 17, 't': 3, 'action': 'left', 'reward': 0.6418386235257546, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent drove left instead of right. (rewarded 0.64)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 4), heading: (-1, 0), action: left, reward: -40.695363688
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', 'left', 'forward'), 'deadline': 16, 't': 4, 'action': 'left', 'reward': -40.69536368801527, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'forward')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -40.70)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 4), heading: (-1, 0), action: None, reward: 1.06122373992
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 15, 't': 5, 'action': None, 'reward': 1.061223739924728, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.06)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (8, 4), heading: (-1, 0), action: left, reward: -10.3061588895
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 14, 't': 6, 'action': 'left', 'reward': -10.306158889529282, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent attempted driving left through a red light. (rewarded -10.31)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (7, 4), heading: (-1, 0), action: forward, reward: -0.0433270768809
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'right'), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': -0.04332707688085291, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'right')
Agent drove forward instead of left. (rewarded -0.04)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (7, 4), heading: (-1, 0), action: None, reward: 2.04622750281
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 12, 't': 8, 'action': None, 'reward': 2.0462275028095824, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.05)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (7, 4), heading: (-1, 0), action: None, reward: 1.96932408108
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 11, 't': 9, 'action': None, 'reward': 1.9693240810757675, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.97)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (6, 4), heading: (-1, 0), action: forward, reward: 1.86863028664
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 10, 't': 10, 'action': 'forward', 'reward': 1.868630286640558, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.87)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (6, 4), heading: (-1, 0), action: None, reward: 1.77638472227
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 9, 't': 11, 'action': None, 'reward': 1.7763847222725564, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.78)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (6, 4), heading: (-1, 0), action: None, reward: 2.10832830999
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 8, 't': 12, 'action': None, 'reward': 2.108328309994796, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.11)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (5, 4), heading: (-1, 0), action: forward, reward: 1.51397615104
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 7, 't': 13, 'action': 'forward', 'reward': 1.5139761510440526, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.51)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (5, 4), heading: (-1, 0), action: None, reward: -4.81253880657
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 6, 't': 14, 'action': None, 'reward': -4.812538806568698, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -4.81)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (5, 5), heading: (0, 1), action: left, reward: 0.442862145662
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 5, 't': 15, 'action': 'left', 'reward': 0.4428621456622326, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent drove left instead of forward. (rewarded 0.44)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 5), heading: (-1, 0), action: right, reward: 0.558018631462
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 4, 't': 16, 'action': 'right', 'reward': 0.5580186314617159, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent followed the waypoint right. (rewarded 0.56)
15% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 73
\-------------------------
Environment.reset(): Trial set up with start = (6, 7), destination = (3, 6), deadline = 20
Simulating trial. . .
epsilon = 0.4868; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 7), heading: (-1, 0), action: forward, reward: 1.35919856034
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 20, 't': 0, 'action': 'forward', 'reward': 1.3591985603420578, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.36)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (4, 7), heading: (-1, 0), action: forward, reward: 2.45422150015
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 19, 't': 1, 'action': 'forward', 'reward': 2.4542215001453673, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'right')
Agent followed the waypoint forward. (rewarded 2.45)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (4, 6), heading: (0, -1), action: right, reward: 1.48269201723
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'left'), 'deadline': 18, 't': 2, 'action': 'right', 'reward': 1.4826920172338771, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'left')
Agent drove right instead of forward. (rewarded 1.48)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (4, 6), heading: (0, -1), action: None, reward: -5.81584555175
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 17, 't': 3, 'action': None, 'reward': -5.815845551748924, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.82)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (4, 6), heading: (0, -1), action: None, reward: 2.49020061368
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 16, 't': 4, 'action': None, 'reward': 2.490200613680857, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.49)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (4, 6), heading: (0, -1), action: None, reward: 2.46048995163
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 15, 't': 5, 'action': None, 'reward': 2.4604899516257044, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.46)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 6), heading: (0, -1), action: forward, reward: -9.12872623563
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 14, 't': 6, 'action': 'forward', 'reward': -9.12872623562569, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -9.13)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (4, 5), heading: (0, -1), action: forward, reward: 0.415542599895
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 0.41554259989485365, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent drove forward instead of left. (rewarded 0.42)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (5, 5), heading: (1, 0), action: right, reward: 1.75650755952
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', 'left'), 'deadline': 12, 't': 8, 'action': 'right', 'reward': 1.756507559520113, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', 'left')
Agent drove right instead of left. (rewarded 1.76)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (6, 5), heading: (1, 0), action: forward, reward: 0.418406157396
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 11, 't': 9, 'action': 'forward', 'reward': 0.41840615739618436, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent drove forward instead of right. (rewarded 0.42)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (6, 6), heading: (0, 1), action: right, reward: 2.64608880004
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 10, 't': 10, 'action': 'right', 'reward': 2.646088800035672, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 2.65)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (5, 6), heading: (-1, 0), action: right, reward: 1.71383335233
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'right'), 'deadline': 9, 't': 11, 'action': 'right', 'reward': 1.7138333523256146, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'right')
Agent followed the waypoint right. (rewarded 1.71)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (4, 6), heading: (-1, 0), action: forward, reward: 1.05084428837
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 8, 't': 12, 'action': 'forward', 'reward': 1.0508442883657254, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.05)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (4, 6), heading: (-1, 0), action: None, reward: 1.99774766523
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', 'forward'), 'deadline': 7, 't': 13, 'action': None, 'reward': 1.9977476652287411, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', 'forward')
Agent properly idled at a red light. (rewarded 2.00)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (4, 6), heading: (-1, 0), action: None, reward: 1.88906523667
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 6, 't': 14, 'action': None, 'reward': 1.8890652366674485, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 1.89)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (4, 6), heading: (-1, 0), action: forward, reward: -9.59466359193
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 5, 't': 15, 'action': 'forward', 'reward': -9.594663591933948, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -9.59)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 6), heading: (-1, 0), action: forward, reward: 0.541144502477
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 4, 't': 16, 'action': 'forward', 'reward': 0.5411445024769272, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 0.54)
15% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 74
\-------------------------
Environment.reset(): Trial set up with start = (8, 5), destination = (3, 4), deadline = 20
Simulating trial. . .
epsilon = 0.4819; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 5), heading: (0, 1), action: left, reward: -10.1955944479
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 20, 't': 0, 'action': 'left', 'reward': -10.195594447939898, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent attempted driving left through a red light. (rewarded -10.20)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 5), heading: (0, 1), action: right, reward: -19.4857832329
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': 'forward'}, 'violation': 3, 'light': 'red', 'state': ('left', 'red', 'forward', 'forward'), 'deadline': 19, 't': 1, 'action': 'right', 'reward': -19.485783232856043, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'forward')
Agent attempted driving right through traffic and cause a minor accident. (rewarded -19.49)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 5), heading: (0, 1), action: None, reward: 2.06544253008
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.0654425300790127, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.07)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 5), heading: (0, 1), action: None, reward: 2.34590952427
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 2.34590952427345, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.35)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 5), heading: (0, 1), action: None, reward: 2.5192226832
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 16, 't': 4, 'action': None, 'reward': 2.519222683198085, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.52)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 6), heading: (0, 1), action: forward, reward: 1.89162285947
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', 'right'), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 1.8916228594732534, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', 'right')
Agent drove forward instead of left. (rewarded 1.89)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (8, 6), heading: (0, 1), action: None, reward: 1.10062111273
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 14, 't': 6, 'action': None, 'reward': 1.100621112732977, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.10)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: left, reward: 2.35011776981
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'left'), 'deadline': 13, 't': 7, 'action': 'left', 'reward': 2.3501177698131577, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'left')
Agent followed the waypoint left. (rewarded 2.35)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: forward, reward: 1.45660806227
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 12, 't': 8, 'action': 'forward', 'reward': 1.4566080622720645, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.46)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: forward, reward: 2.26571795602
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 11, 't': 9, 'action': 'forward', 'reward': 2.265717956015382, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.27)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: None, reward: 2.64230481968
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 10, 't': 10, 'action': None, 'reward': 2.642304819676422, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.64)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (3, 5), heading: (0, -1), action: left, reward: 1.18415055157
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 9, 't': 11, 'action': 'left', 'reward': 1.184150551572824, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 1.18)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: right, reward: 1.41193214758
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 8, 't': 12, 'action': 'right', 'reward': 1.4119321475757944, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent drove right instead of forward. (rewarded 1.41)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: None, reward: 2.21270791207
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 7, 't': 13, 'action': None, 'reward': 2.2127079120722923, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.21)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: None, reward: 1.14579963511
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 6, 't': 14, 'action': None, 'reward': 1.1457996351105828, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.15)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: left, reward: -40.4026321808
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 5, 't': 15, 'action': 'left', 'reward': -40.40263218077328, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -40.40)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: left, reward: -9.99524382537
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 4, 't': 16, 'action': 'left', 'reward': -9.995243825371748, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent attempted driving left through a red light. (rewarded -10.00)
15% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: None, reward: -5.58005297074
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', 'left', 'right'), 'deadline': 3, 't': 17, 'action': None, 'reward': -5.5800529707392, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'right')
Agent idled at a green light with no oncoming traffic. (rewarded -5.58)
10% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (5, 5), heading: (1, 0), action: forward, reward: 0.515404698076
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 2, 't': 18, 'action': 'forward', 'reward': 0.5154046980762328, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove forward instead of left. (rewarded 0.52)
5% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (5, 5), heading: (1, 0), action: None, reward: 2.11594718041
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', 'forward'), 'deadline': 1, 't': 19, 'action': None, 'reward': 2.115947180414758, 'waypoint': 'left'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('left', 'red', 'right', 'forward')
Agent properly idled at a red light. (rewarded 2.12)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 75
\-------------------------
Environment.reset(): Trial set up with start = (2, 3), destination = (5, 2), deadline = 20
Simulating trial. . .
epsilon = 0.4771; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 3), heading: (0, -1), action: None, reward: 1.12478006073
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'right', 'forward'), 'deadline': 20, 't': 0, 'action': None, 'reward': 1.1247800607267302, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', 'forward')
Agent properly idled at a red light. (rewarded 1.12)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 3), heading: (0, -1), action: forward, reward: -9.93114575409
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 19, 't': 1, 'action': 'forward', 'reward': -9.931145754090608, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -9.93)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 3), heading: (0, -1), action: left, reward: -10.8128982261
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 18, 't': 2, 'action': 'left', 'reward': -10.81289822609417, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -10.81)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: right, reward: 1.23227627707
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 17, 't': 3, 'action': 'right', 'reward': 1.232276277066222, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.23)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: None, reward: 2.0638292441
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 16, 't': 4, 'action': None, 'reward': 2.06382924410019, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.06)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (4, 3), heading: (1, 0), action: forward, reward: 1.32360571394
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 1.3236057139362214, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.32)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 3), heading: (1, 0), action: forward, reward: -10.0841771618
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 14, 't': 6, 'action': 'forward', 'reward': -10.084177161833912, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent attempted driving forward through a red light. (rewarded -10.08)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (5, 3), heading: (1, 0), action: forward, reward: 1.77325083033
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 1.7732508303334134, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.77)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (5, 3), heading: (1, 0), action: None, reward: 2.33884659027
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 12, 't': 8, 'action': None, 'reward': 2.3388465902726523, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.34)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (5, 3), heading: (1, 0), action: forward, reward: -10.7548941296
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 11, 't': 9, 'action': 'forward', 'reward': -10.754894129641324, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent attempted driving forward through a red light. (rewarded -10.75)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (5, 4), heading: (0, 1), action: right, reward: 1.53017781202
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 10, 't': 10, 'action': 'right', 'reward': 1.5301778120224032, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent drove right instead of left. (rewarded 1.53)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (4, 4), heading: (-1, 0), action: right, reward: 2.21060508516
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 9, 't': 11, 'action': 'right', 'reward': 2.2106050851604166, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.21)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (4, 3), heading: (0, -1), action: right, reward: 2.28216804292
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 8, 't': 12, 'action': 'right', 'reward': 2.2821680429237508, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 2.28)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (4, 3), heading: (0, -1), action: forward, reward: -40.9903974113
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('right', 'red', None, 'forward'), 'deadline': 7, 't': 13, 'action': 'forward', 'reward': -40.99039741125794, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'forward')
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -40.99)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (4, 3), heading: (0, -1), action: None, reward: 1.21960217751
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 6, 't': 14, 'action': None, 'reward': 1.219602177513793, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.22)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (5, 3), heading: (1, 0), action: right, reward: 1.72407319781
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 5, 't': 15, 'action': 'right', 'reward': 1.7240731978118882, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.72)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (5, 4), heading: (0, 1), action: right, reward: 0.582828118192
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', 'right'), 'deadline': 4, 't': 16, 'action': 'right', 'reward': 0.5828281181920054, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'right')
Agent drove right instead of left. (rewarded 0.58)
15% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (4, 4), heading: (-1, 0), action: right, reward: 0.767307357305
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 3, 't': 17, 'action': 'right', 'reward': 0.7673073573052225, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 0.77)
10% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (4, 3), heading: (0, -1), action: right, reward: 0.616355459641
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 2, 't': 18, 'action': 'right', 'reward': 0.6163554596412104, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 0.62)
5% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (5, 3), heading: (1, 0), action: right, reward: 0.211591693734
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 1, 't': 19, 'action': 'right', 'reward': 0.21159169373422615, 'waypoint': 'right'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 0.21)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 76
\-------------------------
Environment.reset(): Trial set up with start = (5, 3), destination = (2, 7), deadline = 25
Simulating trial. . .
epsilon = 0.4724; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 2), heading: (0, -1), action: forward, reward: 0.085561392479
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 25, 't': 0, 'action': 'forward', 'reward': 0.0855613924789902, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent drove forward instead of left. (rewarded 0.09)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 2), heading: (0, -1), action: None, reward: 2.96725704592
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'left'), 'deadline': 24, 't': 1, 'action': None, 'reward': 2.967257045919868, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 2.97)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 2), heading: (0, -1), action: None, reward: 2.49031392269
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'left'), 'deadline': 23, 't': 2, 'action': None, 'reward': 2.490313922687668, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 2.49)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (6, 2), heading: (1, 0), action: right, reward: 1.76548372897
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 22, 't': 3, 'action': 'right', 'reward': 1.7654837289689513, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent drove right instead of left. (rewarded 1.77)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (6, 3), heading: (0, 1), action: right, reward: 1.08553998899
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 21, 't': 4, 'action': 'right', 'reward': 1.0855399889896211, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent drove right instead of forward. (rewarded 1.09)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 3), heading: (-1, 0), action: right, reward: 1.25281031068
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 20, 't': 5, 'action': 'right', 'reward': 1.2528103106795756, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 1.25)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 3), heading: (-1, 0), action: forward, reward: 1.58338455507
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 19, 't': 6, 'action': 'forward', 'reward': 1.5833845550737808, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.58)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 3), heading: (-1, 0), action: forward, reward: 1.49047272706
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 1.490472727056065, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent followed the waypoint forward. (rewarded 1.49)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 3), heading: (-1, 0), action: None, reward: -4.11430583454
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 17, 't': 8, 'action': None, 'reward': -4.114305834540636, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'right')
Agent idled at a green light with no oncoming traffic. (rewarded -4.11)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (3, 3), heading: (-1, 0), action: None, reward: 2.44601058843
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 16, 't': 9, 'action': None, 'reward': 2.4460105884280514, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.45)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (3, 3), heading: (-1, 0), action: None, reward: 2.37790357387
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 15, 't': 10, 'action': None, 'reward': 2.3779035738654057, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.38)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (2, 3), heading: (-1, 0), action: forward, reward: 1.9023752156
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 14, 't': 11, 'action': 'forward', 'reward': 1.9023752155996223, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.90)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (2, 2), heading: (0, -1), action: right, reward: 1.60930910262
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 13, 't': 12, 'action': 'right', 'reward': 1.6093091026204729, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 1.61)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (2, 2), heading: (0, -1), action: None, reward: 1.38281124862
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 12, 't': 13, 'action': None, 'reward': 1.3828112486180935, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.38)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (2, 2), heading: (0, -1), action: None, reward: 1.29808535662
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 11, 't': 14, 'action': None, 'reward': 1.2980853566177548, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.30)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (2, 7), heading: (0, -1), action: forward, reward: 1.21931631558
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 10, 't': 15, 'action': 'forward', 'reward': 1.2193163155846494, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.22)
36% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 77
\-------------------------
Environment.reset(): Trial set up with start = (2, 7), destination = (7, 5), deadline = 25
Simulating trial. . .
epsilon = 0.4677; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 7), heading: (-1, 0), action: left, reward: 1.91521970519
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 25, 't': 0, 'action': 'left', 'reward': 1.9152197051906499, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 1.92)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: forward, reward: 2.05996789149
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 24, 't': 1, 'action': 'forward', 'reward': 2.059967891488471, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent followed the waypoint forward. (rewarded 2.06)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: None, reward: -4.42866564145
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 23, 't': 2, 'action': None, 'reward': -4.4286656414541365, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent idled at a green light with no oncoming traffic. (rewarded -4.43)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: forward, reward: 1.49945952138
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': 1.499459521376139, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.50)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: forward, reward: 1.83958100102
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 1.839581001020239, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent drove forward instead of right. (rewarded 1.84)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (6, 6), heading: (0, -1), action: right, reward: 2.66312368923
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 20, 't': 5, 'action': 'right', 'reward': 2.663123689228048, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 2.66)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (6, 6), heading: (0, -1), action: None, reward: 0.770602917381
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'right'), 'deadline': 19, 't': 6, 'action': None, 'reward': 0.7706029173812979, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 0.77)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (5, 6), heading: (-1, 0), action: left, reward: 1.51811679831
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'right'), 'deadline': 18, 't': 7, 'action': 'left', 'reward': 1.518116798312325, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'right')
Agent drove left instead of right. (rewarded 1.52)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (5, 6), heading: (-1, 0), action: None, reward: 0.299521842091
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 17, 't': 8, 'action': None, 'reward': 0.29952184209099986, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 0.30)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (5, 6), heading: (-1, 0), action: None, reward: 1.57040201567
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', 'forward'), 'deadline': 16, 't': 9, 'action': None, 'reward': 1.570402015667885, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 1.57)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (5, 6), heading: (-1, 0), action: None, reward: 0.0161144923305
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 15, 't': 10, 'action': None, 'reward': 0.016114492330464247, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 0.02)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (5, 7), heading: (0, 1), action: left, reward: 0.922292334319
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 14, 't': 11, 'action': 'left', 'reward': 0.9222923343194022, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent drove left instead of right. (rewarded 0.92)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (6, 7), heading: (1, 0), action: left, reward: 2.20557326381
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 13, 't': 12, 'action': 'left', 'reward': 2.205573263809036, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 2.21)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (6, 7), heading: (1, 0), action: None, reward: 1.7491462402
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 12, 't': 13, 'action': None, 'reward': 1.7491462402027262, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.75)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (6, 7), heading: (1, 0), action: forward, reward: -40.5720920763
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 11, 't': 14, 'action': 'forward', 'reward': -40.57209207625655, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -40.57)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (7, 7), heading: (1, 0), action: forward, reward: 2.20533957474
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 10, 't': 15, 'action': 'forward', 'reward': 2.2053395747399565, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.21)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: forward, reward: 0.745884025671
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 9, 't': 16, 'action': 'forward', 'reward': 0.7458840256712586, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove forward instead of left. (rewarded 0.75)
32% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: left, reward: -9.2078909457
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 8, 't': 17, 'action': 'left', 'reward': -9.207890945703447, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent attempted driving left through a red light. (rewarded -9.21)
28% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: left, reward: -9.81218807036
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 7, 't': 18, 'action': 'left', 'reward': -9.812188070357779, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -9.81)
24% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (8, 6), heading: (0, -1), action: left, reward: 1.55794198417
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'right'), 'deadline': 6, 't': 19, 'action': 'left', 'reward': 1.5579419841708446, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'right')
Agent followed the waypoint left. (rewarded 1.56)
20% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (8, 6), heading: (0, -1), action: forward, reward: -9.16132181263
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 5, 't': 20, 'action': 'forward', 'reward': -9.161321812628563, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -9.16)
16% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (8, 6), heading: (0, -1), action: None, reward: 2.05681174758
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 4, 't': 21, 'action': None, 'reward': 2.056811747581764, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.06)
12% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: right, reward: -0.123721817033
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', 'left'), 'deadline': 3, 't': 22, 'action': 'right', 'reward': -0.12372181703305585, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', 'left')
Agent drove right instead of left. (rewarded -0.12)
8% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (1, 5), heading: (0, -1), action: left, reward: 0.755433551907
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 2, 't': 23, 'action': 'left', 'reward': 0.7554335519067257, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 0.76)
4% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (1, 5), heading: (0, -1), action: None, reward: 0.689215000522
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 1, 't': 24, 'action': None, 'reward': 0.6892150005220645, 'waypoint': 'left'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 0.69)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 78
\-------------------------
Environment.reset(): Trial set up with start = (3, 5), destination = (7, 5), deadline = 20
Simulating trial. . .
epsilon = 0.4630; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: None, reward: -4.0663826542
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'left', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', 'right', None), 'deadline': 20, 't': 0, 'action': None, 'reward': -4.06638265420407, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'right', None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.07)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 6), heading: (0, 1), action: right, reward: 1.88229589289
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 1.8822958928879572, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.88)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 6), heading: (0, 1), action: left, reward: -9.96138362273
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 18, 't': 2, 'action': 'left', 'reward': -9.961383622726197, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -9.96)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 6), heading: (0, 1), action: None, reward: -5.70407767845
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 17, 't': 3, 'action': None, 'reward': -5.704077678454933, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.70)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: right, reward: 1.06351690958
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'right', None), 'deadline': 16, 't': 4, 'action': 'right', 'reward': 1.0635169095819161, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'right', None)
Agent followed the waypoint right. (rewarded 1.06)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: forward, reward: 1.32748062858
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 1.3274806285842746, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.33)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: left, reward: 0.918300874196
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 14, 't': 6, 'action': 'left', 'reward': 0.9183008741961509, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent drove left instead of forward. (rewarded 0.92)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 2), heading: (0, 1), action: forward, reward: 0.366741695772
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'right', None), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 0.36674169577230575, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'right', None)
Agent drove forward instead of right. (rewarded 0.37)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (1, 2), heading: (0, 1), action: left, reward: -40.7435662058
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 4, 'light': 'red', 'state': ('right', 'red', 'right', None), 'deadline': 12, 't': 8, 'action': 'left', 'reward': -40.74356620583848, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', None)
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -40.74)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (1, 3), heading: (0, 1), action: forward, reward: 0.448854863133
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'right', None), 'deadline': 11, 't': 9, 'action': 'forward', 'reward': 0.4488548631329934, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'right', None)
Agent drove forward instead of right. (rewarded 0.45)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (8, 3), heading: (-1, 0), action: right, reward: 2.01781757173
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'right', 'left'), 'deadline': 10, 't': 10, 'action': 'right', 'reward': 2.017817571729357, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', 'left')
Agent followed the waypoint right. (rewarded 2.02)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (8, 2), heading: (0, -1), action: right, reward: -0.166486485353
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 9, 't': 11, 'action': 'right', 'reward': -0.1664864853528052, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent drove right instead of forward. (rewarded -0.17)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (8, 2), heading: (0, -1), action: None, reward: 1.31447109901
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', 'left'), 'deadline': 8, 't': 12, 'action': None, 'reward': 1.3144710990084645, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', 'left')
Agent properly idled at a red light. (rewarded 1.31)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: right, reward: 0.410654999837
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 7, 't': 13, 'action': 'right', 'reward': 0.4106549998374339, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove right instead of left. (rewarded 0.41)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: None, reward: 1.54217926638
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 6, 't': 14, 'action': None, 'reward': 1.5421792663762461, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.54)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (1, 7), heading: (0, -1), action: left, reward: 0.898624053746
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 5, 't': 15, 'action': 'left', 'reward': 0.8986240537456058, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 0.90)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: left, reward: 2.13191313621
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 4, 't': 16, 'action': 'left', 'reward': 2.1319131362111223, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.13)
15% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: None, reward: 0.699674755156
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 3, 't': 17, 'action': None, 'reward': 0.6996747551557743, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.70)
10% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: None, reward: 0.997348663839
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 2, 't': 18, 'action': None, 'reward': 0.9973486638391893, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.00)
5% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: forward, reward: 0.971410441525
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 1, 't': 19, 'action': 'forward', 'reward': 0.9714104415250024, 'waypoint': 'forward'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 0.97)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 79
\-------------------------
Environment.reset(): Trial set up with start = (5, 4), destination = (1, 7), deadline = 35
Simulating trial. . .
epsilon = 0.4584; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 4), heading: (1, 0), action: forward, reward: 2.45890684379
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'left'), 'deadline': 35, 't': 0, 'action': 'forward', 'reward': 2.4589068437932333, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'left')
Agent followed the waypoint forward. (rewarded 2.46)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 4), heading: (1, 0), action: None, reward: -5.79504820647
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'right', 'left': 'left'}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'forward', 'left'), 'deadline': 34, 't': 1, 'action': None, 'reward': -5.795048206471771, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'left')
Agent idled at a green light with no oncoming traffic. (rewarded -5.80)
94% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 4), heading: (1, 0), action: forward, reward: 2.18907420649
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 33, 't': 2, 'action': 'forward', 'reward': 2.1890742064872692, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 2.19)
91% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 4), heading: (1, 0), action: forward, reward: 1.28551205523
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 32, 't': 3, 'action': 'forward', 'reward': 1.2855120552262367, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.29)
89% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 4), heading: (1, 0), action: forward, reward: 2.17571807706
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 31, 't': 4, 'action': 'forward', 'reward': 2.1757180770606785, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent followed the waypoint forward. (rewarded 2.18)
86% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 5), heading: (0, 1), action: right, reward: 1.94278305645
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'left'), 'deadline': 30, 't': 5, 'action': 'right', 'reward': 1.942783056452875, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'left')
Agent drove right instead of left. (rewarded 1.94)
83% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 5), heading: (0, 1), action: left, reward: -40.7700064413
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 29, 't': 6, 'action': 'left', 'reward': -40.77000644129267, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -40.77)
80% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: left, reward: 1.68194610131
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 28, 't': 7, 'action': 'left', 'reward': 1.6819461013060923, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove left instead of forward. (rewarded 1.68)
77% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: left, reward: -39.0298327717
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('right', 'red', 'left', 'forward'), 'deadline': 27, 't': 8, 'action': 'left', 'reward': -39.029832771678166, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', 'forward')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -39.03)
74% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: right, reward: -20.2101386252
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 3, 'light': 'red', 'state': ('right', 'red', 'left', 'forward'), 'deadline': 26, 't': 9, 'action': 'right', 'reward': -20.210138625193512, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', 'forward')
Agent attempted driving right through traffic and cause a minor accident. (rewarded -20.21)
71% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (2, 4), heading: (0, -1), action: left, reward: 1.36666840795
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 25, 't': 10, 'action': 'left', 'reward': 1.3666684079485898, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent drove left instead of right. (rewarded 1.37)
69% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (2, 4), heading: (0, -1), action: None, reward: 1.09945242571
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 24, 't': 11, 'action': None, 'reward': 1.0994524257072276, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.10)
66% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (2, 4), heading: (0, -1), action: None, reward: 1.50698320631
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 23, 't': 12, 'action': None, 'reward': 1.5069832063106448, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.51)
63% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (2, 3), heading: (0, -1), action: forward, reward: 1.66031438461
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 22, 't': 13, 'action': 'forward', 'reward': 1.6603143846072772, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove forward instead of left. (rewarded 1.66)
60% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: right, reward: 0.982973627748
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 21, 't': 14, 'action': 'right', 'reward': 0.9829736277475681, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 0.98)
57% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (3, 2), heading: (0, -1), action: left, reward: 1.31861092778
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 20, 't': 15, 'action': 'left', 'reward': 1.3186109277790499, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.32)
54% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: left, reward: 1.66559985879
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 19, 't': 16, 'action': 'left', 'reward': 1.665599858786025, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 1.67)
51% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: None, reward: -4.71378786008
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 18, 't': 17, 'action': None, 'reward': -4.713787860083147, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -4.71)
49% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (2, 7), heading: (0, -1), action: right, reward: 0.36656636184
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 17, 't': 18, 'action': 'right', 'reward': 0.36656636184015623, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent drove right instead of forward. (rewarded 0.37)
46% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (3, 7), heading: (1, 0), action: right, reward: 0.429283958177
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'left'), 'deadline': 16, 't': 19, 'action': 'right', 'reward': 0.4292839581767879, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'left')
Agent drove right instead of left. (rewarded 0.43)
43% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (3, 2), heading: (0, 1), action: right, reward: 1.62485759798
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 15, 't': 20, 'action': 'right', 'reward': 1.6248575979756723, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.62)
40% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: right, reward: 2.02307570292
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 14, 't': 21, 'action': 'right', 'reward': 2.0230757029196313, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent followed the waypoint right. (rewarded 2.02)
37% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: None, reward: -5.27885292226
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 13, 't': 22, 'action': None, 'reward': -5.278852922260892, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -5.28)
34% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: None, reward: 1.60500805963
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 12, 't': 23, 'action': None, 'reward': 1.6050080596271314, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.61)
31% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: None, reward: 1.43686404122
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 11, 't': 24, 'action': None, 'reward': 1.4368640412161462, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.44)
29% of time remaining to reach destination.
/-------------------
| Step 25 Results
\-------------------
Environment.step(): t = 25
Environment.act() [POST]: location: (1, 2), heading: (-1, 0), action: forward, reward: 0.828303186127
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 10, 't': 25, 'action': 'forward', 'reward': 0.8283031861272239, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 0.83)
26% of time remaining to reach destination.
/-------------------
| Step 26 Results
\-------------------
Environment.step(): t = 26
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 7), heading: (0, -1), action: right, reward: 0.721605127256
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 9, 't': 26, 'action': 'right', 'reward': 0.7216051272559072, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 0.72)
23% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 80
\-------------------------
Environment.reset(): Trial set up with start = (4, 3), destination = (8, 3), deadline = 20
Simulating trial. . .
epsilon = 0.4538; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (4, 3), heading: (0, 1), action: None, reward: -5.55639200448
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': 'left'}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', 'forward', 'left'), 'deadline': 20, 't': 0, 'action': None, 'reward': -5.5563920044779, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', 'left')
Agent idled at a green light with no oncoming traffic. (rewarded -5.56)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 3), heading: (-1, 0), action: right, reward: 2.92013667898
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 2.9201366789817316, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent followed the waypoint right. (rewarded 2.92)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 3), heading: (-1, 0), action: None, reward: -5.29743236775
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 18, 't': 2, 'action': None, 'reward': -5.2974323677480175, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent idled at a green light with no oncoming traffic. (rewarded -5.30)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 2), heading: (0, -1), action: right, reward: 0.909264573508
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 17, 't': 3, 'action': 'right', 'reward': 0.909264573508409, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent drove right instead of forward. (rewarded 0.91)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (4, 2), heading: (1, 0), action: right, reward: 0.375559209362
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 16, 't': 4, 'action': 'right', 'reward': 0.37555920936249054, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent drove right instead of left. (rewarded 0.38)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (4, 3), heading: (0, 1), action: right, reward: 2.53637819196
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 15, 't': 5, 'action': 'right', 'reward': 2.5363781919619792, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.54)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 3), heading: (-1, 0), action: right, reward: 1.96627180488
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 14, 't': 6, 'action': 'right', 'reward': 1.9662718048820569, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.97)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 3), heading: (-1, 0), action: right, reward: -19.0594537693
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': 'forward'}, 'violation': 3, 'light': 'red', 'state': ('forward', 'red', 'forward', 'forward'), 'deadline': 13, 't': 7, 'action': 'right', 'reward': -19.059453769313418, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'forward')
Agent attempted driving right through traffic and cause a minor accident. (rewarded -19.06)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 3), heading: (-1, 0), action: None, reward: 1.81572941066
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'right'), 'deadline': 12, 't': 8, 'action': None, 'reward': 1.8157294106591793, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'right')
Agent properly idled at a red light. (rewarded 1.82)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 3), heading: (-1, 0), action: forward, reward: 2.47127898277
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 11, 't': 9, 'action': 'forward', 'reward': 2.471278982770344, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.47)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (2, 3), heading: (-1, 0), action: None, reward: 2.69880962003
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 10, 't': 10, 'action': None, 'reward': 2.698809620030076, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.70)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (1, 3), heading: (-1, 0), action: forward, reward: 1.06088636311
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'right'), 'deadline': 9, 't': 11, 'action': 'forward', 'reward': 1.0608863631068264, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'right')
Agent followed the waypoint forward. (rewarded 1.06)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (1, 3), heading: (-1, 0), action: left, reward: -9.56029145807
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 8, 't': 12, 'action': 'left', 'reward': -9.560291458073992, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent attempted driving left through a red light. (rewarded -9.56)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (1, 3), heading: (-1, 0), action: left, reward: -20.9869327077
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 3, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 7, 't': 13, 'action': 'left', 'reward': -20.98693270770146, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent attempted driving left through traffic and cause a minor accident. (rewarded -20.99)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (8, 3), heading: (-1, 0), action: forward, reward: 1.63413736377
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 6, 't': 14, 'action': 'forward', 'reward': 1.634137363772565, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.63)
25% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 81
\-------------------------
Environment.reset(): Trial set up with start = (2, 5), destination = (5, 4), deadline = 20
Simulating trial. . .
epsilon = 0.4493; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: None, reward: 2.38828991654
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'right'), 'deadline': 20, 't': 0, 'action': None, 'reward': 2.3882899165395894, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 2.39)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: right, reward: 1.95940348335
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 1.9594034833515352, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent drove right instead of forward. (rewarded 1.96)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: left, reward: -10.6664463414
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 18, 't': 2, 'action': 'left', 'reward': -10.666446341383542, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent attempted driving left through a red light. (rewarded -10.67)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: None, reward: 2.43897483512
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 2.4389748351202503, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.44)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: None, reward: 1.15816283683
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', 'forward'), 'deadline': 16, 't': 4, 'action': None, 'reward': 1.1581628368296844, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 1.16)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 7), heading: (0, 1), action: forward, reward: 0.544686322442
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 0.5446863224424698, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove forward instead of left. (rewarded 0.54)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 7), heading: (0, 1), action: None, reward: 2.84457513795
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 14, 't': 6, 'action': None, 'reward': 2.84457513795274, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.84)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 7), heading: (-1, 0), action: right, reward: 0.489540528418
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 13, 't': 7, 'action': 'right', 'reward': 0.48954052841835116, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 0.49)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (1, 6), heading: (0, -1), action: right, reward: 1.48679909796
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 12, 't': 8, 'action': 'right', 'reward': 1.4867990979612045, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'right')
Agent drove right instead of forward. (rewarded 1.49)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (1, 5), heading: (0, -1), action: forward, reward: 0.312299955357
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 11, 't': 9, 'action': 'forward', 'reward': 0.3122999553571293, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent drove forward instead of left. (rewarded 0.31)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (1, 5), heading: (0, -1), action: None, reward: 1.62332004982
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 10, 't': 10, 'action': None, 'reward': 1.623320049815175, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.62)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: left, reward: 2.49392777096
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 9, 't': 11, 'action': 'left', 'reward': 2.4939277709577468, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.49)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (8, 4), heading: (0, -1), action: right, reward: -0.323791681743
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 8, 't': 12, 'action': 'right', 'reward': -0.32379168174250483, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent drove right instead of forward. (rewarded -0.32)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (7, 4), heading: (-1, 0), action: left, reward: 1.39260817016
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'right'), 'deadline': 7, 't': 13, 'action': 'left', 'reward': 1.392608170161498, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'right')
Agent followed the waypoint left. (rewarded 1.39)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (7, 4), heading: (-1, 0), action: left, reward: -20.3248151646
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': 'left'}, 'violation': 3, 'light': 'green', 'state': ('forward', 'green', 'forward', 'left'), 'deadline': 6, 't': 14, 'action': 'left', 'reward': -20.32481516456099, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'left')
Agent attempted driving left through traffic and cause a minor accident. (rewarded -20.32)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (6, 4), heading: (-1, 0), action: forward, reward: 1.21006749993
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 5, 't': 15, 'action': 'forward', 'reward': 1.2100674999336392, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.21)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (6, 4), heading: (-1, 0), action: None, reward: 2.31538103991
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'right'), 'deadline': 4, 't': 16, 'action': None, 'reward': 2.3153810399145005, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 2.32)
15% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 4), heading: (-1, 0), action: forward, reward: 0.68315398606
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 3, 't': 17, 'action': 'forward', 'reward': 0.6831539860600255, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 0.68)
10% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 82
\-------------------------
Environment.reset(): Trial set up with start = (1, 2), destination = (7, 4), deadline = 20
Simulating trial. . .
epsilon = 0.4449; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: forward, reward: 2.08154016858
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'forward'), 'deadline': 20, 't': 0, 'action': 'forward', 'reward': 2.0815401685775274, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'forward')
Agent followed the waypoint forward. (rewarded 2.08)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (7, 2), heading: (-1, 0), action: forward, reward: 1.24158096848
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'forward'), 'deadline': 19, 't': 1, 'action': 'forward', 'reward': 1.24158096847534, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'forward')
Agent followed the waypoint forward. (rewarded 1.24)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 2), heading: (-1, 0), action: None, reward: -4.7591656495
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 18, 't': 2, 'action': None, 'reward': -4.759165649504858, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -4.76)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 2), heading: (-1, 0), action: left, reward: -39.3454307068
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 17, 't': 3, 'action': 'left', 'reward': -39.34543070679376, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -39.35)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 2), heading: (-1, 0), action: left, reward: -10.0404583778
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': 'right'}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'left', 'right'), 'deadline': 16, 't': 4, 'action': 'left', 'reward': -10.040458377847745, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'right')
Agent attempted driving left through a red light. (rewarded -10.04)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 3), heading: (0, 1), action: left, reward: 2.88736119353
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 15, 't': 5, 'action': 'left', 'reward': 2.887361193530106, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.89)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (7, 3), heading: (0, 1), action: None, reward: 0.999536357033
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 14, 't': 6, 'action': None, 'reward': 0.9995363570332358, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.00)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (7, 4), heading: (0, 1), action: forward, reward: 2.04325674289
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 2.043256742894215, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.04)
60% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 83
\-------------------------
Environment.reset(): Trial set up with start = (1, 2), destination = (6, 6), deadline = 25
Simulating trial. . .
epsilon = 0.4404; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 2), heading: (-1, 0), action: None, reward: 1.24317054119
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 25, 't': 0, 'action': None, 'reward': 1.2431705411852108, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.24)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 7), heading: (0, -1), action: right, reward: 0.00189245192735
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 24, 't': 1, 'action': 'right', 'reward': 0.0018924519273466611, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent drove right instead of forward. (rewarded 0.00)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 7), heading: (0, -1), action: forward, reward: -10.1463437299
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 23, 't': 2, 'action': 'forward', 'reward': -10.146343729858444, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent attempted driving forward through a red light. (rewarded -10.15)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: left, reward: 2.00936853969
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 22, 't': 3, 'action': 'left', 'reward': 2.009368539690473, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.01)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: forward, reward: 1.9713774103
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 1.9713774102986266, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.97)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: left, reward: -10.7773666712
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 20, 't': 5, 'action': 'left', 'reward': -10.777366671221536, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -10.78)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: None, reward: 1.58642378573
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 19, 't': 6, 'action': None, 'reward': 1.5864237857262928, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.59)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: forward, reward: 1.65824124281
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 1.6582412428055724, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.66)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (6, 2), heading: (0, 1), action: left, reward: 1.19244774607
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 17, 't': 8, 'action': 'left', 'reward': 1.1924477460706817, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent drove left instead of right. (rewarded 1.19)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (5, 2), heading: (-1, 0), action: right, reward: 2.21382265679
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 16, 't': 9, 'action': 'right', 'reward': 2.213822656791155, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.21)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (5, 2), heading: (-1, 0), action: left, reward: -39.5145545102
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('right', 'red', None, 'forward'), 'deadline': 15, 't': 10, 'action': 'left', 'reward': -39.51455451022436, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'forward')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -39.51)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (5, 7), heading: (0, -1), action: right, reward: 2.69114505883
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 14, 't': 11, 'action': 'right', 'reward': 2.691145058828701, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.69)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (5, 7), heading: (0, -1), action: None, reward: -5.3389793606
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': 'right'}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', 'left', 'right'), 'deadline': 13, 't': 12, 'action': None, 'reward': -5.338979360601687, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', 'right')
Agent idled at a green light with no oncoming traffic. (rewarded -5.34)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (6, 7), heading: (1, 0), action: right, reward: 0.925545762409
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 12, 't': 13, 'action': 'right', 'reward': 0.9255457624090471, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 0.93)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (6, 7), heading: (1, 0), action: None, reward: 2.27308049741
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 11, 't': 14, 'action': None, 'reward': 2.2730804974110352, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.27)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (6, 7), heading: (1, 0), action: None, reward: -5.68005037216
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 10, 't': 15, 'action': None, 'reward': -5.6800503721566695, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.68)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 6), heading: (0, -1), action: left, reward: 1.8836928318
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'right'), 'deadline': 9, 't': 16, 'action': 'left', 'reward': 1.88369283180079, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'right')
Agent followed the waypoint left. (rewarded 1.88)
32% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 84
\-------------------------
Environment.reset(): Trial set up with start = (2, 4), destination = (8, 6), deadline = 20
Simulating trial. . .
epsilon = 0.4360; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 5), heading: (0, 1), action: right, reward: 1.74775684105
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.74775684104956, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.75)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: right, reward: 2.62707475087
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 2.627074750872681, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent followed the waypoint right. (rewarded 2.63)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: None, reward: 1.97405892835
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.9740589283539043, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.97)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: forward, reward: 2.28524426926
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 2.2852442692612494, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.29)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: None, reward: 1.15144091776
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 16, 't': 4, 'action': None, 'reward': 1.1514409177604805, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.15)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (8, 6), heading: (0, 1), action: left, reward: 1.61183003486
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'forward'), 'deadline': 15, 't': 5, 'action': 'left', 'reward': 1.6118300348590584, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'forward')
Agent followed the waypoint left. (rewarded 1.61)
70% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 85
\-------------------------
Environment.reset(): Trial set up with start = (7, 5), destination = (4, 4), deadline = 20
Simulating trial. . .
epsilon = 0.4317; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (7, 5), heading: (0, -1), action: None, reward: 1.7808601124
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 20, 't': 0, 'action': None, 'reward': 1.7808601123993926, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.78)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (7, 5), heading: (0, -1), action: None, reward: 2.57894598727
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 19, 't': 1, 'action': None, 'reward': 2.578945987274689, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.58)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 5), heading: (0, -1), action: left, reward: -40.2187299408
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 18, 't': 2, 'action': 'left', 'reward': -40.21872994084716, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -40.22)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 5), heading: (0, -1), action: None, reward: 2.66627482661
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 17, 't': 3, 'action': None, 'reward': 2.6662748266138534, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.67)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 5), heading: (0, -1), action: None, reward: 1.79685934184
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 16, 't': 4, 'action': None, 'reward': 1.7968593418424552, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.80)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 5), heading: (0, -1), action: None, reward: 1.0682018286
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 15, 't': 5, 'action': None, 'reward': 1.0682018286044344, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.07)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (7, 5), heading: (0, -1), action: None, reward: -4.16834173911
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 14, 't': 6, 'action': None, 'reward': -4.168341739114073, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.17)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (7, 5), heading: (0, -1), action: None, reward: -4.67440979643
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'forward', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', 'right', None), 'deadline': 13, 't': 7, 'action': None, 'reward': -4.6744097964268, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.67)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (7, 5), heading: (0, -1), action: left, reward: -20.5579337265
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'forward', 'left': 'left'}, 'violation': 3, 'light': 'green', 'state': ('left', 'green', 'right', 'left'), 'deadline': 12, 't': 8, 'action': 'left', 'reward': -20.557933726467514, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', 'left')
Agent attempted driving left through traffic and cause a minor accident. (rewarded -20.56)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (7, 5), heading: (0, -1), action: None, reward: -4.91822660513
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 11, 't': 9, 'action': None, 'reward': -4.918226605132574, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent idled at a green light with no oncoming traffic. (rewarded -4.92)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (6, 5), heading: (-1, 0), action: left, reward: 1.70358400523
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'right'), 'deadline': 10, 't': 10, 'action': 'left', 'reward': 1.7035840052343918, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'right')
Agent followed the waypoint left. (rewarded 1.70)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (6, 5), heading: (-1, 0), action: None, reward: 1.72908496909
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 9, 't': 11, 'action': None, 'reward': 1.7290849690921848, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.73)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (6, 5), heading: (-1, 0), action: None, reward: 0.801165396533
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 8, 't': 12, 'action': None, 'reward': 0.8011653965330179, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.80)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (5, 5), heading: (-1, 0), action: forward, reward: 1.56367601186
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 7, 't': 13, 'action': 'forward', 'reward': 1.5636760118561066, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.56)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (5, 5), heading: (-1, 0), action: None, reward: 1.04656552047
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 6, 't': 14, 'action': None, 'reward': 1.0465655204709239, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.05)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (5, 6), heading: (0, 1), action: left, reward: 0.00646733082501
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'right'), 'deadline': 5, 't': 15, 'action': 'left', 'reward': 0.0064673308250091655, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'right')
Agent drove left instead of forward. (rewarded 0.01)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (4, 6), heading: (-1, 0), action: right, reward: 1.60699828719
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 4, 't': 16, 'action': 'right', 'reward': 1.6069982871861388, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.61)
15% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (4, 5), heading: (0, -1), action: right, reward: 1.79121623799
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'right', None), 'deadline': 3, 't': 17, 'action': 'right', 'reward': 1.7912162379902261, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', None)
Agent followed the waypoint right. (rewarded 1.79)
10% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (4, 5), heading: (0, -1), action: None, reward: 0.824713663206
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'forward'), 'deadline': 2, 't': 18, 'action': None, 'reward': 0.8247136632058998, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 0.82)
5% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (4, 5), heading: (0, -1), action: None, reward: 0.414154919985
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 1, 't': 19, 'action': None, 'reward': 0.4141549199847472, 'waypoint': 'forward'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 0.41)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 86
\-------------------------
Environment.reset(): Trial set up with start = (4, 5), destination = (8, 3), deadline = 30
Simulating trial. . .
epsilon = 0.4274; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (4, 6), heading: (0, 1), action: right, reward: 0.190532310625
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', 'forward'), 'deadline': 30, 't': 0, 'action': 'right', 'reward': 0.19053231062507048, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', 'forward')
Agent drove right instead of left. (rewarded 0.19)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (4, 6), heading: (0, 1), action: None, reward: 1.39119787774
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'forward'), 'deadline': 29, 't': 1, 'action': None, 'reward': 1.39119787773669, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.39)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 6), heading: (-1, 0), action: right, reward: 2.30508647448
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', 'right'), 'deadline': 28, 't': 2, 'action': 'right', 'reward': 2.305086474480075, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', 'right')
Agent followed the waypoint right. (rewarded 2.31)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 6), heading: (-1, 0), action: None, reward: -5.62937758901
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 27, 't': 3, 'action': None, 'reward': -5.6293775890051005, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.63)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: forward, reward: 2.86645556342
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 26, 't': 4, 'action': 'forward', 'reward': 2.8664555634210567, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'right')
Agent followed the waypoint forward. (rewarded 2.87)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: None, reward: 1.54621922801
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 25, 't': 5, 'action': None, 'reward': 1.5462192280107512, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.55)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: None, reward: 1.12024707778
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 24, 't': 6, 'action': None, 'reward': 1.1202470777810536, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 1.12)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: forward, reward: -10.4237599932
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 23, 't': 7, 'action': 'forward', 'reward': -10.423759993168112, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -10.42)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: None, reward: 1.44150448217
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 22, 't': 8, 'action': None, 'reward': 1.441504482170212, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.44)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: forward, reward: 1.88644568508
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 21, 't': 9, 'action': 'forward', 'reward': 1.8864456850774365, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent followed the waypoint forward. (rewarded 1.89)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: None, reward: 2.30804743668
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 20, 't': 10, 'action': None, 'reward': 2.308047436680618, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.31)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: None, reward: 1.99668087424
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 19, 't': 11, 'action': None, 'reward': 1.99668087424438, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.00)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: left, reward: -10.0296673538
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 18, 't': 12, 'action': 'left', 'reward': -10.029667353785937, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent attempted driving left through a red light. (rewarded -10.03)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (8, 6), heading: (-1, 0), action: forward, reward: 0.934741472867
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 17, 't': 13, 'action': 'forward', 'reward': 0.9347414728669825, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent followed the waypoint forward. (rewarded 0.93)
53% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (8, 6), heading: (-1, 0), action: None, reward: 1.59942700553
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'right'), 'deadline': 16, 't': 14, 'action': None, 'reward': 1.5994270055323236, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'right')
Agent properly idled at a red light. (rewarded 1.60)
50% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (8, 6), heading: (-1, 0), action: None, reward: 0.915780460685
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 15, 't': 15, 'action': None, 'reward': 0.915780460684757, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 0.92)
47% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (8, 6), heading: (-1, 0), action: None, reward: -5.43831564974
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 14, 't': 16, 'action': None, 'reward': -5.438315649737531, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.44)
43% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (8, 6), heading: (-1, 0), action: None, reward: -5.41227575398
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', 'right', None), 'deadline': 13, 't': 17, 'action': None, 'reward': -5.412275753977395, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.41)
40% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (7, 6), heading: (-1, 0), action: forward, reward: 1.51501040411
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', None), 'deadline': 12, 't': 18, 'action': 'forward', 'reward': 1.51501040411115, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', None)
Agent drove forward instead of left. (rewarded 1.52)
37% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (7, 7), heading: (0, 1), action: left, reward: 2.0956161586
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 11, 't': 19, 'action': 'left', 'reward': 2.0956161586035766, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 2.10)
33% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (7, 7), heading: (0, 1), action: None, reward: 0.90680091944
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 10, 't': 20, 'action': None, 'reward': 0.9068009194401219, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.91)
30% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (7, 7), heading: (0, 1), action: None, reward: 1.31003794536
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 9, 't': 21, 'action': None, 'reward': 1.310037945364928, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.31)
27% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (7, 7), heading: (0, 1), action: left, reward: -20.5413940335
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 3, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 8, 't': 22, 'action': 'left', 'reward': -20.541394033480085, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent attempted driving left through traffic and cause a minor accident. (rewarded -20.54)
23% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: left, reward: 2.00747881241
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'forward'), 'deadline': 7, 't': 23, 'action': 'left', 'reward': 2.007478812412623, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'forward')
Agent followed the waypoint left. (rewarded 2.01)
20% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: None, reward: -5.59272833433
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 6, 't': 24, 'action': None, 'reward': -5.592728334331474, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.59)
17% of time remaining to reach destination.
/-------------------
| Step 25 Results
\-------------------
Environment.step(): t = 25
Environment.act() [POST]: location: (8, 2), heading: (0, 1), action: right, reward: 1.19396881922
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 5, 't': 25, 'action': 'right', 'reward': 1.1939688192165197, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 1.19)
13% of time remaining to reach destination.
/-------------------
| Step 26 Results
\-------------------
Environment.step(): t = 26
Environment.act() [POST]: location: (8, 2), heading: (0, 1), action: forward, reward: -9.03713613989
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 4, 't': 26, 'action': 'forward', 'reward': -9.03713613988773, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -9.04)
10% of time remaining to reach destination.
/-------------------
| Step 27 Results
\-------------------
Environment.step(): t = 27
Environment.act() [POST]: location: (8, 2), heading: (0, 1), action: None, reward: 2.20029067916
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'right'), 'deadline': 3, 't': 27, 'action': None, 'reward': 2.2002906791614114, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 2.20)
7% of time remaining to reach destination.
/-------------------
| Step 28 Results
\-------------------
Environment.step(): t = 28
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (8, 3), heading: (0, 1), action: forward, reward: 1.75537987279
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 2, 't': 28, 'action': 'forward', 'reward': 1.7553798727895262, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.76)
3% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 87
\-------------------------
Environment.reset(): Trial set up with start = (4, 5), destination = (2, 7), deadline = 20
Simulating trial. . .
epsilon = 0.4232; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (4, 5), heading: (-1, 0), action: left, reward: -10.2112917586
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 20, 't': 0, 'action': 'left', 'reward': -10.211291758594651, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -10.21)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (4, 4), heading: (0, -1), action: right, reward: 1.71336881835
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 1.7133688183537799, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent drove right instead of forward. (rewarded 1.71)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (4, 3), heading: (0, -1), action: forward, reward: 1.72980791867
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 18, 't': 2, 'action': 'forward', 'reward': 1.7298079186747968, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove forward instead of left. (rewarded 1.73)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 3), heading: (1, 0), action: right, reward: 1.0851706125
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'right'), 'deadline': 17, 't': 3, 'action': 'right', 'reward': 1.085170612496071, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'right')
Agent drove right instead of left. (rewarded 1.09)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 4), heading: (0, 1), action: right, reward: 1.72901238955
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', 'forward'), 'deadline': 16, 't': 4, 'action': 'right', 'reward': 1.7290123895484042, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', 'forward')
Agent drove right instead of left. (rewarded 1.73)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (4, 4), heading: (-1, 0), action: right, reward: 0.954545226696
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 15, 't': 5, 'action': 'right', 'reward': 0.9545452266958192, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent followed the waypoint right. (rewarded 0.95)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 3), heading: (0, -1), action: right, reward: 1.43510396849
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'right'), 'deadline': 14, 't': 6, 'action': 'right', 'reward': 1.4351039684851927, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'right')
Agent drove right instead of forward. (rewarded 1.44)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 3), heading: (-1, 0), action: left, reward: 1.16609863455
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 13, 't': 7, 'action': 'left', 'reward': 1.1660986345542177, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.17)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 3), heading: (-1, 0), action: None, reward: -4.22856236268
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 12, 't': 8, 'action': None, 'reward': -4.228562362676114, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'right')
Agent idled at a green light with no oncoming traffic. (rewarded -4.23)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (3, 3), heading: (-1, 0), action: None, reward: 2.62600839163
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 11, 't': 9, 'action': None, 'reward': 2.6260083916303105, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.63)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (3, 3), heading: (-1, 0), action: None, reward: 1.86967899303
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'right'), 'deadline': 10, 't': 10, 'action': None, 'reward': 1.8696789930346451, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 1.87)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (3, 2), heading: (0, -1), action: right, reward: 1.65558678313
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 9, 't': 11, 'action': 'right', 'reward': 1.655586783134501, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove right instead of forward. (rewarded 1.66)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (3, 2), heading: (0, -1), action: None, reward: 2.17029665493
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', 'left'), 'deadline': 8, 't': 12, 'action': None, 'reward': 2.1702966549296177, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'left')
Agent properly idled at a red light. (rewarded 2.17)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (4, 2), heading: (1, 0), action: right, reward: 1.56602061415
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 7, 't': 13, 'action': 'right', 'reward': 1.566020614150597, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 1.57)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (4, 2), heading: (1, 0), action: None, reward: 1.65854775172
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 6, 't': 14, 'action': None, 'reward': 1.6585477517247358, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.66)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (4, 3), heading: (0, 1), action: right, reward: 1.31206531879
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 5, 't': 15, 'action': 'right', 'reward': 1.3120653187903848, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 1.31)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (3, 3), heading: (-1, 0), action: right, reward: 2.14912872245
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'right'), 'deadline': 4, 't': 16, 'action': 'right', 'reward': 2.14912872244619, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'right')
Agent followed the waypoint right. (rewarded 2.15)
15% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (3, 3), heading: (-1, 0), action: None, reward: 1.54905709975
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 3, 't': 17, 'action': None, 'reward': 1.5490570997479454, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.55)
10% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (3, 3), heading: (-1, 0), action: forward, reward: -40.4109638851
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': 'left'}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', 'forward', 'left'), 'deadline': 2, 't': 18, 'action': 'forward', 'reward': -40.41096388508978, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'left')
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -40.41)
5% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (3, 2), heading: (0, -1), action: right, reward: 0.520833021778
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'left'), 'deadline': 1, 't': 19, 'action': 'right', 'reward': 0.5208330217782072, 'waypoint': 'forward'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('forward', 'green', 'forward', 'left')
Agent drove right instead of forward. (rewarded 0.52)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 88
\-------------------------
Environment.reset(): Trial set up with start = (2, 4), destination = (7, 6), deadline = 25
Simulating trial. . .
epsilon = 0.4190; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: right, reward: 1.68473960047
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 1.6847396004720514, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.68)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: None, reward: 2.82025489475
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 24, 't': 1, 'action': None, 'reward': 2.820254894750544, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 2.82)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: None, reward: 2.87487306958
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 23, 't': 2, 'action': None, 'reward': 2.8748730695835265, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 2.87)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: None, reward: -5.75671427013
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 22, 't': 3, 'action': None, 'reward': -5.7567142701292076, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.76)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 5), heading: (0, 1), action: left, reward: 1.73130275493
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'right'), 'deadline': 21, 't': 4, 'action': 'left', 'reward': 1.7313027549332145, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'right')
Agent drove left instead of forward. (rewarded 1.73)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: left, reward: 1.71459469324
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 20, 't': 5, 'action': 'left', 'reward': 1.7145946932393734, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent drove left instead of right. (rewarded 1.71)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: right, reward: 1.73844121158
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 19, 't': 6, 'action': 'right', 'reward': 1.7384412115760064, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.74)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: None, reward: 1.55545562807
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 18, 't': 7, 'action': None, 'reward': 1.5554556280712346, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.56)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: right, reward: 1.41221646723
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 17, 't': 8, 'action': 'right', 'reward': 1.4122164672309887, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.41)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (1, 5), heading: (0, -1), action: right, reward: -0.0434000009758
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 16, 't': 9, 'action': 'right', 'reward': -0.04340000097583663, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent drove right instead of forward. (rewarded -0.04)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (1, 5), heading: (0, -1), action: None, reward: -4.25470521514
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 15, 't': 10, 'action': None, 'reward': -4.254705215144907, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -4.25)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (1, 5), heading: (0, -1), action: None, reward: 2.13949539147
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 14, 't': 11, 'action': None, 'reward': 2.1394953914679546, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.14)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: right, reward: 0.296603084665
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 13, 't': 12, 'action': 'right', 'reward': 0.29660308466456375, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent drove right instead of left. (rewarded 0.30)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: right, reward: 2.03480531203
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 12, 't': 13, 'action': 'right', 'reward': 2.0348053120348366, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.03)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: right, reward: 1.65594387665
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'right'), 'deadline': 11, 't': 14, 'action': 'right', 'reward': 1.655943876645865, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'right')
Agent followed the waypoint right. (rewarded 1.66)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: None, reward: 1.04882499714
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 10, 't': 15, 'action': None, 'reward': 1.0488249971370411, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.05)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (8, 6), heading: (-1, 0), action: forward, reward: 2.03158707203
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'left'), 'deadline': 9, 't': 16, 'action': 'forward', 'reward': 2.031587072030314, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'left')
Agent followed the waypoint forward. (rewarded 2.03)
32% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (8, 5), heading: (0, -1), action: right, reward: 0.180617848399
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 8, 't': 17, 'action': 'right', 'reward': 0.18061784839858408, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent drove right instead of forward. (rewarded 0.18)
28% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (7, 5), heading: (-1, 0), action: left, reward: 2.48524543486
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 7, 't': 18, 'action': 'left', 'reward': 2.4852454348622777, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 2.49)
24% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (6, 5), heading: (-1, 0), action: forward, reward: 0.553197709356
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 6, 't': 19, 'action': 'forward', 'reward': 0.5531977093556723, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent drove forward instead of left. (rewarded 0.55)
20% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (6, 5), heading: (-1, 0), action: left, reward: -10.7135228117
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 5, 't': 20, 'action': 'left', 'reward': -10.71352281166823, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -10.71)
16% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (6, 6), heading: (0, 1), action: left, reward: 1.68739392015
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 4, 't': 21, 'action': 'left', 'reward': 1.6873939201472765, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.69)
12% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (6, 6), heading: (0, 1), action: None, reward: 1.0814121023
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 3, 't': 22, 'action': None, 'reward': 1.0814121023015781, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.08)
8% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (6, 6), heading: (0, 1), action: left, reward: -39.6224543664
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 2, 't': 23, 'action': 'left', 'reward': -39.62245436635518, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -39.62)
4% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (6, 6), heading: (0, 1), action: forward, reward: -10.96860726
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 1, 't': 24, 'action': 'forward', 'reward': -10.968607260041466, 'waypoint': 'left'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('left', 'red', 'forward', None)
Agent attempted driving forward through a red light. (rewarded -10.97)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 89
\-------------------------
Environment.reset(): Trial set up with start = (1, 7), destination = (5, 5), deadline = 30
Simulating trial. . .
epsilon = 0.4148; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: right, reward: 1.66719189008
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 30, 't': 0, 'action': 'right', 'reward': 1.6671918900761058, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.67)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: None, reward: -5.28231357896
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 29, 't': 1, 'action': None, 'reward': -5.282313578964981, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.28)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: None, reward: -5.63808678711
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 28, 't': 2, 'action': None, 'reward': -5.638086787109331, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.64)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 6), heading: (0, -1), action: right, reward: 1.67803386356
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 27, 't': 3, 'action': 'right', 'reward': 1.6780338635588121, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove right instead of forward. (rewarded 1.68)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: right, reward: 1.498821919
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 26, 't': 4, 'action': 'right', 'reward': 1.4988219189971317, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 1.50)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 5), heading: (0, -1), action: left, reward: 2.65220532834
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 25, 't': 5, 'action': 'left', 'reward': 2.6522053283390408, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 2.65)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: right, reward: 1.58258503834
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 24, 't': 6, 'action': 'right', 'reward': 1.582585038338749, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove right instead of left. (rewarded 1.58)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: None, reward: 1.50122904223
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'right'), 'deadline': 23, 't': 7, 'action': None, 'reward': 1.5012290422345695, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 1.50)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: right, reward: 1.34749639764
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 22, 't': 8, 'action': 'right', 'reward': 1.3474963976353473, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent drove right instead of forward. (rewarded 1.35)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: left, reward: 1.55691202613
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'forward'), 'deadline': 21, 't': 9, 'action': 'left', 'reward': 1.5569120261320721, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'forward')
Agent followed the waypoint left. (rewarded 1.56)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (4, 6), heading: (1, 0), action: forward, reward: 1.23136094287
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 20, 't': 10, 'action': 'forward', 'reward': 1.2313609428671275, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.23)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: forward, reward: 2.62232496802
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 19, 't': 11, 'action': 'forward', 'reward': 2.6223249680249676, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.62)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: None, reward: 1.50385355778
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 18, 't': 12, 'action': None, 'reward': 1.5038535577822636, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.50)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (5, 7), heading: (0, 1), action: right, reward: 0.297514544378
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 17, 't': 13, 'action': 'right', 'reward': 0.29751454437813907, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent drove right instead of left. (rewarded 0.30)
53% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (5, 7), heading: (0, 1), action: left, reward: -10.2271787034
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'right'}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', 'forward', 'right'), 'deadline': 16, 't': 14, 'action': 'left', 'reward': -10.227178703416579, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', 'right')
Agent attempted driving left through a red light. (rewarded -10.23)
50% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (4, 7), heading: (-1, 0), action: right, reward: 1.96889982477
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 15, 't': 15, 'action': 'right', 'reward': 1.9688998247735532, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent followed the waypoint right. (rewarded 1.97)
47% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (4, 7), heading: (-1, 0), action: None, reward: -0.228280037032
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'forward'), 'deadline': 14, 't': 16, 'action': None, 'reward': -0.22828003703178212, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded -0.23)
43% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (4, 6), heading: (0, -1), action: right, reward: 1.48771538064
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 13, 't': 17, 'action': 'right', 'reward': 1.487715380638202, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.49)
40% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (4, 6), heading: (0, -1), action: forward, reward: -40.7589881056
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('right', 'red', 'forward', 'forward'), 'deadline': 12, 't': 18, 'action': 'forward', 'reward': -40.75898810557248, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', 'forward')
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -40.76)
37% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (4, 6), heading: (0, -1), action: None, reward: -4.19183908435
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 11, 't': 19, 'action': None, 'reward': -4.19183908434729, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.19)
33% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: right, reward: 2.19150703081
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'right'), 'deadline': 10, 't': 20, 'action': 'right', 'reward': 2.1915070308118096, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'right')
Agent followed the waypoint right. (rewarded 2.19)
30% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (5, 7), heading: (0, 1), action: right, reward: 1.51246373611
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 9, 't': 21, 'action': 'right', 'reward': 1.5124637361143298, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent drove right instead of left. (rewarded 1.51)
27% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (4, 7), heading: (-1, 0), action: right, reward: 0.630386133778
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 8, 't': 22, 'action': 'right', 'reward': 0.6303861337778414, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 0.63)
23% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (4, 7), heading: (-1, 0), action: forward, reward: -9.33500510756
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 7, 't': 23, 'action': 'forward', 'reward': -9.335005107556336, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -9.34)
20% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (4, 6), heading: (0, -1), action: right, reward: 1.97463357852
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'right'), 'deadline': 6, 't': 24, 'action': 'right', 'reward': 1.9746335785205307, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'right')
Agent followed the waypoint right. (rewarded 1.97)
17% of time remaining to reach destination.
/-------------------
| Step 25 Results
\-------------------
Environment.step(): t = 25
Environment.act() [POST]: location: (3, 6), heading: (-1, 0), action: left, reward: 0.572328108953
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 5, 't': 25, 'action': 'left', 'reward': 0.5723281089528351, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent drove left instead of right. (rewarded 0.57)
13% of time remaining to reach destination.
/-------------------
| Step 26 Results
\-------------------
Environment.step(): t = 26
Environment.act() [POST]: location: (3, 7), heading: (0, 1), action: left, reward: 0.994981195814
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 4, 't': 26, 'action': 'left', 'reward': 0.9949811958137309, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent drove left instead of right. (rewarded 0.99)
10% of time remaining to reach destination.
/-------------------
| Step 27 Results
\-------------------
Environment.step(): t = 27
Environment.act() [POST]: location: (2, 7), heading: (-1, 0), action: right, reward: 0.397113513303
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 3, 't': 27, 'action': 'right', 'reward': 0.39711351330260514, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 0.40)
7% of time remaining to reach destination.
/-------------------
| Step 28 Results
\-------------------
Environment.step(): t = 28
Environment.act() [POST]: location: (2, 2), heading: (0, 1), action: left, reward: 0.470298176542
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 2, 't': 28, 'action': 'left', 'reward': 0.4702981765419455, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent drove left instead of right. (rewarded 0.47)
3% of time remaining to reach destination.
/-------------------
| Step 29 Results
\-------------------
Environment.step(): t = 29
Environment.act() [POST]: location: (2, 2), heading: (0, 1), action: None, reward: -5.01075407545
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 1, 't': 29, 'action': None, 'reward': -5.0107540754520485, 'waypoint': 'left'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('left', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.01)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 90
\-------------------------
Environment.reset(): Trial set up with start = (2, 6), destination = (6, 7), deadline = 25
Simulating trial. . .
epsilon = 0.4107; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 5), heading: (0, -1), action: right, reward: 0.688721321465
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 0.6887213214647381, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove right instead of forward. (rewarded 0.69)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 5), heading: (0, -1), action: None, reward: 2.38398427111
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 24, 't': 1, 'action': None, 'reward': 2.3839842711119976, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.38)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 5), heading: (0, -1), action: None, reward: 1.38249923028
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.3824992302760275, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.38)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 5), heading: (0, -1), action: forward, reward: -9.64533452412
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': -9.645334524123356, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent attempted driving forward through a red light. (rewarded -9.65)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 5), heading: (0, -1), action: left, reward: -20.4619923919
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 3, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 21, 't': 4, 'action': 'left', 'reward': -20.4619923919229, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent attempted driving left through traffic and cause a minor accident. (rewarded -20.46)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 5), heading: (0, -1), action: None, reward: -4.16382546631
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 20, 't': 5, 'action': None, 'reward': -4.16382546630816, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.16)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: left, reward: 2.52770207301
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 19, 't': 6, 'action': 'left', 'reward': 2.5277020730094195, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.53)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 4), heading: (0, -1), action: right, reward: 0.449472407573
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 18, 't': 7, 'action': 'right', 'reward': 0.4494724075726637, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove right instead of forward. (rewarded 0.45)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (1, 4), heading: (0, -1), action: None, reward: 2.85818662541
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 17, 't': 8, 'action': None, 'reward': 2.8581866254131354, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 2.86)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (8, 4), heading: (-1, 0), action: left, reward: 0.886664053698
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 16, 't': 9, 'action': 'left', 'reward': 0.8866640536979902, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 0.89)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (8, 4), heading: (-1, 0), action: None, reward: 1.75852413093
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 15, 't': 10, 'action': None, 'reward': 1.7585241309349482, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.76)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (8, 3), heading: (0, -1), action: right, reward: 0.457321669478
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 14, 't': 11, 'action': 'right', 'reward': 0.4573216694778478, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent drove right instead of forward. (rewarded 0.46)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (8, 2), heading: (0, -1), action: forward, reward: 0.254089802886
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 13, 't': 12, 'action': 'forward', 'reward': 0.2540898028861105, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent drove forward instead of left. (rewarded 0.25)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (8, 2), heading: (0, -1), action: None, reward: 2.57749199149
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 12, 't': 13, 'action': None, 'reward': 2.577491991493612, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.58)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (8, 2), heading: (0, -1), action: None, reward: 2.00903194888
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 11, 't': 14, 'action': None, 'reward': 2.009031948878388, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.01)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (8, 2), heading: (0, -1), action: None, reward: 1.14470481556
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 10, 't': 15, 'action': None, 'reward': 1.1447048155580872, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 1.14)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (7, 2), heading: (-1, 0), action: left, reward: 2.19259650319
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 9, 't': 16, 'action': 'left', 'reward': 2.192596503190075, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.19)
32% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (7, 2), heading: (-1, 0), action: None, reward: 1.10332132199
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 8, 't': 17, 'action': None, 'reward': 1.103321321994163, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.10)
28% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (7, 2), heading: (-1, 0), action: right, reward: -19.4484672646
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 3, 'light': 'red', 'state': ('forward', 'red', 'left', 'forward'), 'deadline': 7, 't': 18, 'action': 'right', 'reward': -19.448467264641707, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'forward')
Agent attempted driving right through traffic and cause a minor accident. (rewarded -19.45)
24% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (6, 2), heading: (-1, 0), action: forward, reward: 1.98054152594
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'left'), 'deadline': 6, 't': 19, 'action': 'forward', 'reward': 1.9805415259434476, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'left')
Agent followed the waypoint forward. (rewarded 1.98)
20% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (6, 3), heading: (0, 1), action: left, reward: 0.0199339343958
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 5, 't': 20, 'action': 'left', 'reward': 0.019933934395761188, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent drove left instead of right. (rewarded 0.02)
16% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (5, 3), heading: (-1, 0), action: right, reward: 1.44033126395
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 4, 't': 21, 'action': 'right', 'reward': 1.440331263946906, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.44)
12% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (5, 3), heading: (-1, 0), action: None, reward: -5.32046042664
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 3, 't': 22, 'action': None, 'reward': -5.32046042663772, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.32)
8% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (5, 3), heading: (-1, 0), action: left, reward: -19.8273107639
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 3, 'light': 'green', 'state': ('right', 'green', 'right', None), 'deadline': 2, 't': 23, 'action': 'left', 'reward': -19.82731076391991, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'right', None)
Agent attempted driving left through traffic and cause a minor accident. (rewarded -19.83)
4% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (5, 2), heading: (0, -1), action: right, reward: 1.01411759091
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 1, 't': 24, 'action': 'right', 'reward': 1.0141175909096813, 'waypoint': 'right'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 1.01)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 91
\-------------------------
Environment.reset(): Trial set up with start = (3, 7), destination = (8, 4), deadline = 30
Simulating trial. . .
epsilon = 0.4066; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (4, 7), heading: (1, 0), action: forward, reward: 1.9685008475
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', 'forward'), 'deadline': 30, 't': 0, 'action': 'forward', 'reward': 1.9685008474964962, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', 'forward')
Agent drove forward instead of right. (rewarded 1.97)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (4, 7), heading: (1, 0), action: None, reward: 1.80863538763
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 29, 't': 1, 'action': None, 'reward': 1.8086353876293542, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.81)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (4, 2), heading: (0, 1), action: right, reward: 2.47392811711
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 28, 't': 2, 'action': 'right', 'reward': 2.4739281171094154, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent followed the waypoint right. (rewarded 2.47)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (4, 3), heading: (0, 1), action: forward, reward: 0.231025378545
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', 'right'), 'deadline': 27, 't': 3, 'action': 'forward', 'reward': 0.2310253785450581, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', 'right')
Agent drove forward instead of right. (rewarded 0.23)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (4, 4), heading: (0, 1), action: forward, reward: 0.804189264663
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', 'right'), 'deadline': 26, 't': 4, 'action': 'forward', 'reward': 0.8041892646627004, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', 'right')
Agent drove forward instead of right. (rewarded 0.80)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (3, 4), heading: (-1, 0), action: right, reward: 1.13799405533
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 25, 't': 5, 'action': 'right', 'reward': 1.1379940553349994, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent followed the waypoint right. (rewarded 1.14)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 4), heading: (-1, 0), action: None, reward: 2.7866311126
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 24, 't': 6, 'action': None, 'reward': 2.7866311125991503, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.79)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 4), heading: (-1, 0), action: None, reward: 1.34513753463
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 23, 't': 7, 'action': None, 'reward': 1.3451375346279781, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.35)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 4), heading: (-1, 0), action: left, reward: -10.5414077682
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 22, 't': 8, 'action': 'left', 'reward': -10.541407768200393, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -10.54)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 4), heading: (-1, 0), action: forward, reward: 2.5538396219
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 21, 't': 9, 'action': 'forward', 'reward': 2.5538396218997583, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'right')
Agent followed the waypoint forward. (rewarded 2.55)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (2, 4), heading: (-1, 0), action: None, reward: 1.26084259611
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 20, 't': 10, 'action': None, 'reward': 1.2608425961081988, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 1.26)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (2, 3), heading: (0, -1), action: right, reward: 1.62887798709
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 19, 't': 11, 'action': 'right', 'reward': 1.6288779870877081, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent drove right instead of forward. (rewarded 1.63)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: right, reward: 1.10224612919
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 18, 't': 12, 'action': 'right', 'reward': 1.1022461291879906, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent drove right instead of left. (rewarded 1.10)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (3, 4), heading: (0, 1), action: right, reward: 1.53030768746
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 17, 't': 13, 'action': 'right', 'reward': 1.5303076874617891, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent followed the waypoint right. (rewarded 1.53)
53% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (3, 5), heading: (0, 1), action: forward, reward: 0.462306891132
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 16, 't': 14, 'action': 'forward', 'reward': 0.4623068911323034, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent drove forward instead of right. (rewarded 0.46)
50% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: right, reward: 2.30458350826
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 15, 't': 15, 'action': 'right', 'reward': 2.304583508256857, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 2.30)
47% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: forward, reward: -40.568086808
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'forward', 'left': 'left'}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', 'right', 'left'), 'deadline': 14, 't': 16, 'action': 'forward', 'reward': -40.56808680795595, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', 'left')
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -40.57)
43% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: forward, reward: -10.1135970893
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 13, 't': 17, 'action': 'forward', 'reward': -10.113597089274437, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent attempted driving forward through a red light. (rewarded -10.11)
40% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: None, reward: 2.49679565804
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 12, 't': 18, 'action': None, 'reward': 2.496795658042779, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.50)
37% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: forward, reward: 2.02841154883
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 11, 't': 19, 'action': 'forward', 'reward': 2.0284115488313468, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.03)
33% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: forward, reward: 1.71086422245
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 10, 't': 20, 'action': 'forward', 'reward': 1.7108642224495638, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.71)
30% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (8, 4), heading: (0, -1), action: right, reward: 2.37433366133
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 9, 't': 21, 'action': 'right', 'reward': 2.374333661330371, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.37)
27% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 92
\-------------------------
Environment.reset(): Trial set up with start = (7, 6), destination = (6, 3), deadline = 20
Simulating trial. . .
epsilon = 0.4025; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (7, 6), heading: (0, 1), action: left, reward: -10.1905398215
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 20, 't': 0, 'action': 'left', 'reward': -10.190539821496603, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent attempted driving left through a red light. (rewarded -10.19)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 6), heading: (-1, 0), action: right, reward: 1.09764005059
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 1.0976400505897543, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.10)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 6), heading: (-1, 0), action: None, reward: 2.28203691492
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.2820369149160005, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.28)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (6, 6), heading: (-1, 0), action: left, reward: -9.82932099776
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 17, 't': 3, 'action': 'left', 'reward': -9.829320997755545, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent attempted driving left through a red light. (rewarded -9.83)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 6), heading: (-1, 0), action: forward, reward: 0.133437549697
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 0.13343754969726274, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove forward instead of left. (rewarded 0.13)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 7), heading: (0, 1), action: left, reward: 1.41657723247
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'forward'), 'deadline': 15, 't': 5, 'action': 'left', 'reward': 1.4165772324698072, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'forward')
Agent followed the waypoint left. (rewarded 1.42)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (5, 7), heading: (0, 1), action: forward, reward: -10.0459656243
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 14, 't': 6, 'action': 'forward', 'reward': -10.045965624303367, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent attempted driving forward through a red light. (rewarded -10.05)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (5, 7), heading: (0, 1), action: None, reward: 1.67754983032
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 13, 't': 7, 'action': None, 'reward': 1.67754983031989, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.68)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (4, 7), heading: (-1, 0), action: right, reward: 1.31049751295
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 12, 't': 8, 'action': 'right', 'reward': 1.3104975129536078, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent drove right instead of left. (rewarded 1.31)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (4, 2), heading: (0, 1), action: left, reward: 1.383237842
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 11, 't': 9, 'action': 'left', 'reward': 1.38323784199563, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 1.38)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (4, 2), heading: (0, 1), action: None, reward: 1.63397080199
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 10, 't': 10, 'action': None, 'reward': 1.6339708019889476, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.63)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (4, 3), heading: (0, 1), action: forward, reward: 0.0410648982004
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'left'), 'deadline': 9, 't': 11, 'action': 'forward', 'reward': 0.04106489820042092, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'left')
Agent drove forward instead of left. (rewarded 0.04)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (4, 3), heading: (0, 1), action: None, reward: 1.91590324683
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 8, 't': 12, 'action': None, 'reward': 1.9159032468284944, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.92)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (5, 3), heading: (1, 0), action: left, reward: 2.21103519941
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 7, 't': 13, 'action': 'left', 'reward': 2.2110351994147903, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.21)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 3), heading: (1, 0), action: forward, reward: 0.67069601781
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 6, 't': 14, 'action': 'forward', 'reward': 0.6706960178097008, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 0.67)
25% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 93
\-------------------------
Environment.reset(): Trial set up with start = (5, 4), destination = (1, 7), deadline = 35
Simulating trial. . .
epsilon = 0.3985; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 4), heading: (-1, 0), action: None, reward: 0.451213205903
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', 'forward'), 'deadline': 35, 't': 0, 'action': None, 'reward': 0.45121320590270053, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 0.45)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 3), heading: (0, -1), action: right, reward: 1.20722248505
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 34, 't': 1, 'action': 'right', 'reward': 1.2072224850464488, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 1.21)
94% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 3), heading: (1, 0), action: right, reward: 1.260133599
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 33, 't': 2, 'action': 'right', 'reward': 1.2601335990025901, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 1.26)
91% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 3), heading: (1, 0), action: forward, reward: 1.53644247471
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 32, 't': 3, 'action': 'forward', 'reward': 1.5364424747069239, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.54)
89% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 3), heading: (1, 0), action: None, reward: 2.28383758462
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 31, 't': 4, 'action': None, 'reward': 2.283837584624607, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.28)
86% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 2), heading: (0, -1), action: left, reward: 0.676672868646
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 30, 't': 5, 'action': 'left', 'reward': 0.6766728686463748, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove left instead of forward. (rewarded 0.68)
83% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (7, 2), heading: (0, -1), action: forward, reward: -10.3035220669
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 29, 't': 6, 'action': 'forward', 'reward': -10.30352206686366, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent attempted driving forward through a red light. (rewarded -10.30)
80% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 2), heading: (-1, 0), action: left, reward: 1.38935358913
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 28, 't': 7, 'action': 'left', 'reward': 1.3893535891290067, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent drove left instead of right. (rewarded 1.39)
77% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (6, 2), heading: (-1, 0), action: forward, reward: -9.34682169899
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 27, 't': 8, 'action': 'forward', 'reward': -9.346821698988547, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent attempted driving forward through a red light. (rewarded -9.35)
74% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (6, 2), heading: (-1, 0), action: None, reward: -4.05814242927
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 26, 't': 9, 'action': None, 'reward': -4.0581424292721815, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.06)
71% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (6, 3), heading: (0, 1), action: left, reward: -0.0718435012038
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 25, 't': 10, 'action': 'left', 'reward': -0.0718435012037516, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent drove left instead of right. (rewarded -0.07)
69% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (6, 3), heading: (0, 1), action: None, reward: 0.973747331371
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 24, 't': 11, 'action': None, 'reward': 0.9737473313705709, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.97)
66% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (6, 3), heading: (0, 1), action: None, reward: 2.53242387158
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 23, 't': 12, 'action': None, 'reward': 2.5324238715802587, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.53)
63% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (7, 3), heading: (1, 0), action: left, reward: 2.57751722226
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 22, 't': 13, 'action': 'left', 'reward': 2.5775172222614815, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.58)
60% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: forward, reward: 2.41891471657
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 21, 't': 14, 'action': 'forward', 'reward': 2.4189147165684095, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.42)
57% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: forward, reward: 1.79097253169
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 20, 't': 15, 'action': 'forward', 'reward': 1.7909725316940308, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.79)
54% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (1, 2), heading: (0, -1), action: left, reward: 1.05465828106
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'right'), 'deadline': 19, 't': 16, 'action': 'left', 'reward': 1.0546582810633869, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'right')
Agent followed the waypoint left. (rewarded 1.05)
51% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (1, 2), heading: (0, -1), action: forward, reward: -10.5752749627
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 18, 't': 17, 'action': 'forward', 'reward': -10.575274962676453, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -10.58)
49% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (1, 2), heading: (0, -1), action: None, reward: 1.05527084381
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 17, 't': 18, 'action': None, 'reward': 1.0552708438082457, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.06)
46% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 7), heading: (0, -1), action: forward, reward: 1.41938895585
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 16, 't': 19, 'action': 'forward', 'reward': 1.4193889558451818, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent followed the waypoint forward. (rewarded 1.42)
43% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 94
\-------------------------
Environment.reset(): Trial set up with start = (4, 3), destination = (8, 5), deadline = 30
Simulating trial. . .
epsilon = 0.3946; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (4, 3), heading: (-1, 0), action: None, reward: 2.52334724536
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 30, 't': 0, 'action': None, 'reward': 2.523347245360635, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 2.52)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (4, 3), heading: (-1, 0), action: None, reward: 2.60413483115
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 29, 't': 1, 'action': None, 'reward': 2.604134831154603, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.60)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (4, 2), heading: (0, -1), action: right, reward: 1.82823697645
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 28, 't': 2, 'action': 'right', 'reward': 1.8282369764540194, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent drove right instead of forward. (rewarded 1.83)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 2), heading: (1, 0), action: right, reward: 1.83622512444
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 27, 't': 3, 'action': 'right', 'reward': 1.8362251244376724, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 1.84)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 2), heading: (1, 0), action: None, reward: -4.09919299556
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 26, 't': 4, 'action': None, 'reward': -4.099192995558145, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.10)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (6, 2), heading: (1, 0), action: forward, reward: 2.14196869531
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 25, 't': 5, 'action': 'forward', 'reward': 2.1419686953085924, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.14)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (6, 2), heading: (1, 0), action: None, reward: 2.72384888021
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 24, 't': 6, 'action': None, 'reward': 2.723848880205563, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.72)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (7, 2), heading: (1, 0), action: forward, reward: 1.12214167974
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 23, 't': 7, 'action': 'forward', 'reward': 1.122141679744982, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent followed the waypoint forward. (rewarded 1.12)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (8, 2), heading: (1, 0), action: forward, reward: 2.16598076601
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 22, 't': 8, 'action': 'forward', 'reward': 2.165980766009847, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.17)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (8, 2), heading: (1, 0), action: None, reward: 1.98119389007
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', 'forward'), 'deadline': 21, 't': 9, 'action': None, 'reward': 1.9811938900738673, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 1.98)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (8, 3), heading: (0, 1), action: right, reward: 1.06334666457
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', 'left'), 'deadline': 20, 't': 10, 'action': 'right', 'reward': 1.0633466645702943, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', 'left')
Agent drove right instead of left. (rewarded 1.06)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (8, 4), heading: (0, 1), action: forward, reward: 1.98473716406
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 19, 't': 11, 'action': 'forward', 'reward': 1.9847371640553901, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.98)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (8, 4), heading: (0, 1), action: forward, reward: -39.6561402957
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 18, 't': 12, 'action': 'forward', 'reward': -39.65614029566145, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -39.66)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (8, 5), heading: (0, 1), action: forward, reward: 1.26539383698
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 17, 't': 13, 'action': 'forward', 'reward': 1.2653938369768576, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.27)
53% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 95
\-------------------------
Environment.reset(): Trial set up with start = (5, 3), destination = (1, 2), deadline = 25
Simulating trial. . .
epsilon = 0.3906; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 3), heading: (1, 0), action: left, reward: -9.42619448959
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 25, 't': 0, 'action': 'left', 'reward': -9.426194489592277, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent attempted driving left through a red light. (rewarded -9.43)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 4), heading: (0, 1), action: right, reward: 1.59802000773
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 24, 't': 1, 'action': 'right', 'reward': 1.5980200077349123, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent drove right instead of forward. (rewarded 1.60)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 4), heading: (0, 1), action: None, reward: 2.29363981329
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 23, 't': 2, 'action': None, 'reward': 2.2936398132945284, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.29)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 4), heading: (0, 1), action: None, reward: 1.51680903527
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 22, 't': 3, 'action': None, 'reward': 1.5168090352716626, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.52)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 4), heading: (0, 1), action: None, reward: 2.84304190977
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 21, 't': 4, 'action': None, 'reward': 2.8430419097650192, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.84)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 4), heading: (0, 1), action: None, reward: 2.71931387941
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 20, 't': 5, 'action': None, 'reward': 2.719313879405566, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.72)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (6, 4), heading: (1, 0), action: left, reward: 1.25266505034
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 19, 't': 6, 'action': 'left', 'reward': 1.2526650503440018, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.25)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (7, 4), heading: (1, 0), action: forward, reward: 1.74832103908
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 1.748321039081006, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.75)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (8, 4), heading: (1, 0), action: forward, reward: 1.19529415617
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 17, 't': 8, 'action': 'forward', 'reward': 1.1952941561674286, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.20)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (8, 4), heading: (1, 0), action: forward, reward: -10.3073525112
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 16, 't': 9, 'action': 'forward', 'reward': -10.307352511243806, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -10.31)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (8, 5), heading: (0, 1), action: right, reward: 1.21560075398
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 15, 't': 10, 'action': 'right', 'reward': 1.2156007539836873, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent drove right instead of forward. (rewarded 1.22)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (7, 5), heading: (-1, 0), action: right, reward: 1.27061888032
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 14, 't': 11, 'action': 'right', 'reward': 1.2706188803162206, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 1.27)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (7, 5), heading: (-1, 0), action: None, reward: 2.22487277752
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 13, 't': 12, 'action': None, 'reward': 2.2248727775222443, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.22)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (7, 5), heading: (-1, 0), action: forward, reward: -10.6778225028
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 12, 't': 13, 'action': 'forward', 'reward': -10.677822502829049, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -10.68)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (7, 5), heading: (-1, 0), action: left, reward: -9.19476176484
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 11, 't': 14, 'action': 'left', 'reward': -9.19476176484116, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent attempted driving left through a red light. (rewarded -9.19)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (7, 5), heading: (-1, 0), action: None, reward: 1.74908113797
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 10, 't': 15, 'action': None, 'reward': 1.7490811379680755, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.75)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (7, 6), heading: (0, 1), action: left, reward: 1.52714451397
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 9, 't': 16, 'action': 'left', 'reward': 1.5271445139718536, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.53)
32% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (7, 6), heading: (0, 1), action: None, reward: 1.81924859663
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 8, 't': 17, 'action': None, 'reward': 1.8192485966316412, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.82)
28% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (7, 6), heading: (0, 1), action: left, reward: -10.3328196087
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 7, 't': 18, 'action': 'left', 'reward': -10.332819608722868, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent attempted driving left through a red light. (rewarded -10.33)
24% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (7, 6), heading: (0, 1), action: None, reward: 2.44329610738
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 6, 't': 19, 'action': None, 'reward': 2.443296107375267, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.44)
20% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (7, 6), heading: (0, 1), action: forward, reward: -39.1762765596
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 5, 't': 20, 'action': 'forward', 'reward': -39.17627655957857, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -39.18)
16% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: left, reward: 1.02564149881
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 4, 't': 21, 'action': 'left', 'reward': 1.0256414988050406, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 1.03)
12% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: None, reward: 1.46255591847
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 3, 't': 22, 'action': None, 'reward': 1.4625559184708323, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.46)
8% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: None, reward: 0.211496751857
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 2, 't': 23, 'action': None, 'reward': 0.21149675185667793, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.21)
4% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: None, reward: 1.43250108152
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 1, 't': 24, 'action': None, 'reward': 1.4325010815200867, 'waypoint': 'forward'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.43)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 96
\-------------------------
Environment.reset(): Trial set up with start = (1, 7), destination = (7, 5), deadline = 20
Simulating trial. . .
epsilon = 0.3867; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 7), heading: (0, -1), action: None, reward: 1.09645578891
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 20, 't': 0, 'action': None, 'reward': 1.0964557889135267, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.10)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 7), heading: (0, -1), action: None, reward: 2.40663813753
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 19, 't': 1, 'action': None, 'reward': 2.4066381375315222, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 2.41)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 7), heading: (0, -1), action: None, reward: 1.32895981675
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.328959816747989, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.33)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (1, 6), heading: (0, -1), action: forward, reward: -0.00755587362296
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': -0.0075558736229616175, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove forward instead of left. (rewarded -0.01)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 6), heading: (0, -1), action: None, reward: 1.92080164227
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 16, 't': 4, 'action': None, 'reward': 1.9208016422700507, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.92)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 6), heading: (0, -1), action: None, reward: 2.25395065165
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 15, 't': 5, 'action': None, 'reward': 2.2539506516521923, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.25)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 6), heading: (0, -1), action: None, reward: 0.997033757138
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 14, 't': 6, 'action': None, 'reward': 0.9970337571376582, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.00)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 5), heading: (0, -1), action: forward, reward: 0.148335318358
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 0.14833531835848257, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent drove forward instead of left. (rewarded 0.15)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (1, 5), heading: (0, -1), action: None, reward: 1.54073167776
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 12, 't': 8, 'action': None, 'reward': 1.540731677757392, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.54)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: left, reward: 1.48543500264
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 11, 't': 9, 'action': 'left', 'reward': 1.485435002642741, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.49)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: None, reward: 1.31702470351
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 10, 't': 10, 'action': None, 'reward': 1.3170247035060234, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.32)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: None, reward: 2.20811481801
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 9, 't': 11, 'action': None, 'reward': 2.2081148180126116, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.21)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: left, reward: -39.0767381523
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', 'left', 'forward'), 'deadline': 8, 't': 12, 'action': 'left', 'reward': -39.07673815232115, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'forward')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -39.08)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (7, 5), heading: (-1, 0), action: forward, reward: 1.50678305979
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'left'), 'deadline': 7, 't': 13, 'action': 'forward', 'reward': 1.5067830597859733, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'left')
Agent followed the waypoint forward. (rewarded 1.51)
30% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 97
\-------------------------
Environment.reset(): Trial set up with start = (1, 2), destination = (2, 5), deadline = 20
Simulating trial. . .
epsilon = 0.3829; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: None, reward: 1.54655630652
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'right'), 'deadline': 20, 't': 0, 'action': None, 'reward': 1.5465563065162369, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'right')
Agent properly idled at a red light. (rewarded 1.55)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: None, reward: 1.37376106174
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 19, 't': 1, 'action': None, 'reward': 1.373761061736303, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.37)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: None, reward: 1.66192997892
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.661929978924023, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.66)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (1, 7), heading: (0, -1), action: left, reward: 1.83388994715
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 17, 't': 3, 'action': 'left', 'reward': 1.833889947146214, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent drove left instead of forward. (rewarded 1.83)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 7), heading: (1, 0), action: right, reward: 1.70657995656
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 16, 't': 4, 'action': 'right', 'reward': 1.7065799565616817, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.71)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (3, 7), heading: (1, 0), action: forward, reward: 1.14165293601
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 1.1416529360084313, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove forward instead of left. (rewarded 1.14)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 6), heading: (0, -1), action: left, reward: 1.25043931764
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 14, 't': 6, 'action': 'left', 'reward': 1.2504393176365014, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 1.25)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 6), heading: (0, -1), action: None, reward: -4.61478811377
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 13, 't': 7, 'action': None, 'reward': -4.614788113774542, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.61)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (4, 6), heading: (1, 0), action: right, reward: 1.01607270782
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 12, 't': 8, 'action': 'right', 'reward': 1.0160727078193559, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent drove right instead of left. (rewarded 1.02)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (4, 6), heading: (1, 0), action: None, reward: 2.10100871056
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 11, 't': 9, 'action': None, 'reward': 2.1010087105646553, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.10)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (4, 5), heading: (0, -1), action: left, reward: 1.05929454748
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'right'), 'deadline': 10, 't': 10, 'action': 'left', 'reward': 1.0592945474815656, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'right')
Agent followed the waypoint left. (rewarded 1.06)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (4, 5), heading: (0, -1), action: None, reward: 2.53633469237
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 9, 't': 11, 'action': None, 'reward': 2.5363346923674954, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.54)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (4, 5), heading: (0, -1), action: None, reward: 2.35157731406
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 8, 't': 12, 'action': None, 'reward': 2.351577314061745, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.35)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (5, 5), heading: (1, 0), action: right, reward: 1.32082320508
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 7, 't': 13, 'action': 'right', 'reward': 1.320823205083451, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 1.32)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (5, 5), heading: (1, 0), action: None, reward: 0.00655871174045
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 6, 't': 14, 'action': None, 'reward': 0.006558711740451861, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.01)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (5, 6), heading: (0, 1), action: right, reward: 0.767949190156
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'right', None), 'deadline': 5, 't': 15, 'action': 'right', 'reward': 0.7679491901555975, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'right', None)
Agent followed the waypoint right. (rewarded 0.77)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (5, 6), heading: (0, 1), action: None, reward: -0.401919127677
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'forward'), 'deadline': 4, 't': 16, 'action': None, 'reward': -0.4019191276768259, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded -0.40)
15% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (4, 6), heading: (-1, 0), action: right, reward: 1.00292775007
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 3, 't': 17, 'action': 'right', 'reward': 1.002927750071828, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.00)
10% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (3, 6), heading: (-1, 0), action: forward, reward: 0.933173089419
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 2, 't': 18, 'action': 'forward', 'reward': 0.9331730894189227, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 0.93)
5% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (3, 6), heading: (-1, 0), action: forward, reward: -10.4617065294
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 1, 't': 19, 'action': 'forward', 'reward': -10.46170652942906, 'waypoint': 'forward'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -10.46)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 98
\-------------------------
Environment.reset(): Trial set up with start = (8, 3), destination = (4, 3), deadline = 20
Simulating trial. . .
epsilon = 0.3791; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 4), heading: (0, 1), action: forward, reward: 1.69919180973
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', 'right'), 'deadline': 20, 't': 0, 'action': 'forward', 'reward': 1.6991918097310847, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', 'right')
Agent drove forward instead of left. (rewarded 1.70)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 4), heading: (0, 1), action: None, reward: -4.4902892915
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 19, 't': 1, 'action': None, 'reward': -4.490289291504285, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -4.49)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 4), heading: (1, 0), action: left, reward: 1.31828945128
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 18, 't': 2, 'action': 'left', 'reward': 1.3182894512778285, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 1.32)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (1, 5), heading: (0, 1), action: right, reward: 0.85805588247
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 17, 't': 3, 'action': 'right', 'reward': 0.858055882469528, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent drove right instead of forward. (rewarded 0.86)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: left, reward: 1.13736967836
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 16, 't': 4, 'action': 'left', 'reward': 1.1373696783620506, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.14)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 4), heading: (0, -1), action: left, reward: 1.07305543639
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'right'), 'deadline': 15, 't': 5, 'action': 'left', 'reward': 1.0730554363946456, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'right')
Agent drove left instead of forward. (rewarded 1.07)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: right, reward: 1.97127154095
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 14, 't': 6, 'action': 'right', 'reward': 1.9712715409500492, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent followed the waypoint right. (rewarded 1.97)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: None, reward: 1.1905044055
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 13, 't': 7, 'action': None, 'reward': 1.1905044055036766, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.19)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: None, reward: 1.42630473735
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 12, 't': 8, 'action': None, 'reward': 1.426304737347902, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.43)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (4, 4), heading: (1, 0), action: forward, reward: 1.09248400468
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 11, 't': 9, 'action': 'forward', 'reward': 1.0924840046773283, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.09)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (4, 5), heading: (0, 1), action: right, reward: 0.486930362737
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 10, 't': 10, 'action': 'right', 'reward': 0.48693036273681034, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 0.49)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (3, 5), heading: (-1, 0), action: right, reward: 1.36450890249
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 9, 't': 11, 'action': 'right', 'reward': 1.3645089024880246, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 1.36)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (3, 5), heading: (-1, 0), action: left, reward: -9.31846986434
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 8, 't': 12, 'action': 'left', 'reward': -9.318469864339583, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent attempted driving left through a red light. (rewarded -9.32)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (3, 6), heading: (0, 1), action: left, reward: 0.46028764535
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 7, 't': 13, 'action': 'left', 'reward': 0.4602876453498531, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent drove left instead of right. (rewarded 0.46)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: right, reward: 0.0705252550499
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 6, 't': 14, 'action': 'right', 'reward': 0.07052525504991747, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove right instead of left. (rewarded 0.07)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: forward, reward: -10.6950549918
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 5, 't': 15, 'action': 'forward', 'reward': -10.695054991815653, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent attempted driving forward through a red light. (rewarded -10.70)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: None, reward: 2.00501233581
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 4, 't': 16, 'action': None, 'reward': 2.005012335808603, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.01)
15% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: forward, reward: -0.341772463515
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 3, 't': 17, 'action': 'forward', 'reward': -0.34177246351466384, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent drove forward instead of left. (rewarded -0.34)
10% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (8, 6), heading: (-1, 0), action: forward, reward: -0.0733265269577
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 2, 't': 18, 'action': 'forward', 'reward': -0.0733265269576786, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove forward instead of left. (rewarded -0.07)
5% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (8, 7), heading: (0, 1), action: left, reward: 1.57923970889
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 1, 't': 19, 'action': 'left', 'reward': 1.579239708885853, 'waypoint': 'left'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.58)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 99
\-------------------------
Environment.reset(): Trial set up with start = (7, 7), destination = (1, 5), deadline = 20
Simulating trial. . .
epsilon = 0.3753; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: left, reward: 2.08530295715
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 20, 't': 0, 'action': 'left', 'reward': 2.0853029571508332, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 2.09)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 6), heading: (0, -1), action: left, reward: 0.821918285477
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 19, 't': 1, 'action': 'left', 'reward': 0.8219182854767461, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent drove left instead of forward. (rewarded 0.82)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 6), heading: (0, -1), action: forward, reward: -10.5022991911
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': 'forward', 'reward': -10.502299191057377, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent attempted driving forward through a red light. (rewarded -10.50)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: right, reward: 1.90710259718
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 17, 't': 3, 'action': 'right', 'reward': 1.907102597184556, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 1.91)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 5), heading: (0, -1), action: left, reward: 1.34098222542
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 16, 't': 4, 'action': 'left', 'reward': 1.3409822254179407, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 1.34)
75% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 100
\-------------------------
Environment.reset(): Trial set up with start = (3, 2), destination = (7, 3), deadline = 25
Simulating trial. . .
epsilon = 0.3716; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: forward, reward: 2.27652067246
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 25, 't': 0, 'action': 'forward', 'reward': 2.276520672462553, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.28)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: None, reward: 1.49619639786
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 24, 't': 1, 'action': None, 'reward': 1.4961963978559443, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.50)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: None, reward: 2.3832150494
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 23, 't': 2, 'action': None, 'reward': 2.3832150493998743, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.38)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (1, 2), heading: (-1, 0), action: forward, reward: 1.22895371814
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': 1.2289537181444299, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.23)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: forward, reward: 2.13272435074
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 2.132724350736883, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.13)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: left, reward: -40.4816165832
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 20, 't': 5, 'action': 'left', 'reward': -40.48161658323546, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -40.48)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (8, 7), heading: (0, -1), action: right, reward: 1.21162861177
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 19, 't': 6, 'action': 'right', 'reward': 1.211628611765248, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent drove right instead of forward. (rewarded 1.21)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: left, reward: 1.07538362096
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 18, 't': 7, 'action': 'left', 'reward': 1.0753836209632135, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 1.08)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (7, 2), heading: (0, 1), action: left, reward: 1.84243180636
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 17, 't': 8, 'action': 'left', 'reward': 1.8424318063566472, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.84)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (7, 3), heading: (0, 1), action: forward, reward: 2.63049857067
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 16, 't': 9, 'action': 'forward', 'reward': 2.6304985706735864, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.63)
60% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 101
\-------------------------
Environment.reset(): Trial set up with start = (4, 7), destination = (1, 6), deadline = 20
Simulating trial. . .
epsilon = 0.3679; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (4, 7), heading: (0, -1), action: None, reward: 2.43838343815
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', 'forward'), 'deadline': 20, 't': 0, 'action': None, 'reward': 2.4383834381476093, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', 'forward')
Agent properly idled at a red light. (rewarded 2.44)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (4, 7), heading: (0, -1), action: forward, reward: -10.6066393149
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 19, 't': 1, 'action': 'forward', 'reward': -10.606639314885928, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -10.61)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (4, 7), heading: (0, -1), action: None, reward: 2.28469650382
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.284696503823724, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.28)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (4, 7), heading: (0, -1), action: forward, reward: -9.36424401739
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': -9.364244017390375, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -9.36)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 7), heading: (1, 0), action: right, reward: 0.619110520652
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 16, 't': 4, 'action': 'right', 'reward': 0.6191105206524653, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove right instead of left. (rewarded 0.62)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 2), heading: (0, 1), action: right, reward: 0.0798336106963
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 15, 't': 5, 'action': 'right', 'reward': 0.07983361069628869, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent drove right instead of forward. (rewarded 0.08)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (5, 2), heading: (0, 1), action: None, reward: 2.20183631389
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 14, 't': 6, 'action': None, 'reward': 2.201836313893504, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 2.20)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 2), heading: (1, 0), action: left, reward: 2.23923729573
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 13, 't': 7, 'action': 'left', 'reward': 2.2392372957317916, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.24)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (7, 2), heading: (1, 0), action: forward, reward: 2.74573805338
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 12, 't': 8, 'action': 'forward', 'reward': 2.7457380533765967, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.75)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (7, 2), heading: (1, 0), action: None, reward: 0.864531897681
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'forward'), 'deadline': 11, 't': 9, 'action': None, 'reward': 0.8645318976806315, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 0.86)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (7, 2), heading: (1, 0), action: None, reward: 1.43109119227
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'right'), 'deadline': 10, 't': 10, 'action': None, 'reward': 1.4310911922701246, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'right')
Agent properly idled at a red light. (rewarded 1.43)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (8, 2), heading: (1, 0), action: forward, reward: 1.39031613829
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 9, 't': 11, 'action': 'forward', 'reward': 1.390316138290242, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.39)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: forward, reward: 0.820865589711
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 8, 't': 12, 'action': 'forward', 'reward': 0.8208655897108283, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 0.82)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: left, reward: -9.1019119795
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 7, 't': 13, 'action': 'left', 'reward': -9.101911979496307, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent attempted driving left through a red light. (rewarded -9.10)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: None, reward: 2.27838398368
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 6, 't': 14, 'action': None, 'reward': 2.278383983683587, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.28)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (1, 7), heading: (0, -1), action: left, reward: 1.16466876997
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 5, 't': 15, 'action': 'left', 'reward': 1.1646687699681528, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.16)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 6), heading: (0, -1), action: forward, reward: 1.84892547258
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 4, 't': 16, 'action': 'forward', 'reward': 1.8489254725780078, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.85)
15% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 102
\-------------------------
Environment.reset(): Trial set up with start = (6, 6), destination = (2, 3), deadline = 35
Simulating trial. . .
epsilon = 0.3642; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 6), heading: (-1, 0), action: forward, reward: 0.961988923864
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', 'forward'), 'deadline': 35, 't': 0, 'action': 'forward', 'reward': 0.9619889238641137, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', 'forward')
Agent drove forward instead of left. (rewarded 0.96)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 6), heading: (-1, 0), action: None, reward: 1.45182343133
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 34, 't': 1, 'action': None, 'reward': 1.4518234313328797, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.45)
94% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 6), heading: (-1, 0), action: forward, reward: -9.06220992392
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 33, 't': 2, 'action': 'forward', 'reward': -9.0622099239171, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent attempted driving forward through a red light. (rewarded -9.06)
91% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 6), heading: (-1, 0), action: None, reward: 1.60113949775
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 32, 't': 3, 'action': None, 'reward': 1.6011394977505893, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.60)
89% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 6), heading: (-1, 0), action: None, reward: 2.49595030224
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 31, 't': 4, 'action': None, 'reward': 2.495950302237621, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.50)
86% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 6), heading: (-1, 0), action: forward, reward: -9.47337860602
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 30, 't': 5, 'action': 'forward', 'reward': -9.473378606016814, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent attempted driving forward through a red light. (rewarded -9.47)
83% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 6), heading: (-1, 0), action: forward, reward: 2.66762654849
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 29, 't': 6, 'action': 'forward', 'reward': 2.6676265484940798, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.67)
80% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 6), heading: (-1, 0), action: forward, reward: 2.8873985663
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 28, 't': 7, 'action': 'forward', 'reward': 2.88739856630009, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.89)
77% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 6), heading: (-1, 0), action: None, reward: 2.29667619648
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 27, 't': 8, 'action': None, 'reward': 2.2966761964846736, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.30)
74% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: forward, reward: 1.8765594632
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 26, 't': 9, 'action': 'forward', 'reward': 1.876559463204682, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.88)
71% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: None, reward: 2.17541201508
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 25, 't': 10, 'action': None, 'reward': 2.1754120150811387, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.18)
69% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: None, reward: 2.25152272778
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 24, 't': 11, 'action': None, 'reward': 2.251522727775729, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.25)
66% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: None, reward: 1.75236147441
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 23, 't': 12, 'action': None, 'reward': 1.7523614744082747, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.75)
63% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (2, 7), heading: (0, 1), action: left, reward: 0.864586322465
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 22, 't': 13, 'action': 'left', 'reward': 0.8645863224650954, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 0.86)
60% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (2, 7), heading: (0, 1), action: None, reward: 1.84036834516
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 21, 't': 14, 'action': None, 'reward': 1.8403683451649024, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.84)
57% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (1, 7), heading: (-1, 0), action: right, reward: 0.994669902916
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 20, 't': 15, 'action': 'right', 'reward': 0.9946699029159168, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent drove right instead of forward. (rewarded 0.99)
54% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (1, 7), heading: (-1, 0), action: None, reward: 1.13212391532
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 19, 't': 16, 'action': None, 'reward': 1.132123915318106, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.13)
51% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (1, 6), heading: (0, -1), action: right, reward: 1.06469914185
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'left'), 'deadline': 18, 't': 17, 'action': 'right', 'reward': 1.064699141851242, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'left')
Agent drove right instead of left. (rewarded 1.06)
49% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: right, reward: 1.8156232862
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 17, 't': 18, 'action': 'right', 'reward': 1.815623286195803, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.82)
46% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (2, 7), heading: (0, 1), action: right, reward: 1.28379332688
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 16, 't': 19, 'action': 'right', 'reward': 1.2837933268795712, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent followed the waypoint right. (rewarded 1.28)
43% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (1, 7), heading: (-1, 0), action: right, reward: 0.629454654069
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 15, 't': 20, 'action': 'right', 'reward': 0.6294546540690293, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent drove right instead of forward. (rewarded 0.63)
40% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: forward, reward: -0.32857957816
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', None), 'deadline': 14, 't': 21, 'action': 'forward', 'reward': -0.32857957815956884, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', None)
Agent drove forward instead of left. (rewarded -0.33)
37% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (8, 6), heading: (0, -1), action: right, reward: 0.60092190877
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 13, 't': 22, 'action': 'right', 'reward': 0.6009219087700479, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove right instead of left. (rewarded 0.60)
34% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: right, reward: 2.25792052431
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 12, 't': 23, 'action': 'right', 'reward': 2.2579205243122864, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 2.26)
31% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: None, reward: 2.08399754962
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'right'), 'deadline': 11, 't': 24, 'action': None, 'reward': 2.083997549620248, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'right')
Agent properly idled at a red light. (rewarded 2.08)
29% of time remaining to reach destination.
/-------------------
| Step 25 Results
\-------------------
Environment.step(): t = 25
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: forward, reward: 0.539882890176
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 10, 't': 25, 'action': 'forward', 'reward': 0.5398828901757142, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 0.54)
26% of time remaining to reach destination.
/-------------------
| Step 26 Results
\-------------------
Environment.step(): t = 26
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: forward, reward: -9.87030988507
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', 'right', None), 'deadline': 9, 't': 26, 'action': 'forward', 'reward': -9.870309885065545, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', None)
Agent attempted driving forward through a red light. (rewarded -9.87)
23% of time remaining to reach destination.
/-------------------
| Step 27 Results
\-------------------
Environment.step(): t = 27
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: forward, reward: -39.1474986834
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('right', 'red', 'forward', 'forward'), 'deadline': 8, 't': 27, 'action': 'forward', 'reward': -39.14749868343947, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', 'forward')
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -39.15)
20% of time remaining to reach destination.
/-------------------
| Step 28 Results
\-------------------
Environment.step(): t = 28
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: left, reward: -9.53804767336
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 7, 't': 28, 'action': 'left', 'reward': -9.538047673363284, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent attempted driving left through a red light. (rewarded -9.54)
17% of time remaining to reach destination.
/-------------------
| Step 29 Results
\-------------------
Environment.step(): t = 29
Environment.act() [POST]: location: (2, 7), heading: (0, 1), action: right, reward: 1.55869338578
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 6, 't': 29, 'action': 'right', 'reward': 1.558693385778879, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent followed the waypoint right. (rewarded 1.56)
14% of time remaining to reach destination.
/-------------------
| Step 30 Results
\-------------------
Environment.step(): t = 30
Environment.act() [POST]: location: (1, 7), heading: (-1, 0), action: right, reward: 0.60936048928
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', 'forward'), 'deadline': 5, 't': 30, 'action': 'right', 'reward': 0.6093604892801794, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', 'forward')
Agent drove right instead of forward. (rewarded 0.61)
11% of time remaining to reach destination.
/-------------------
| Step 31 Results
\-------------------
Environment.step(): t = 31
Environment.act() [POST]: location: (1, 7), heading: (-1, 0), action: None, reward: 1.89289380198
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 4, 't': 31, 'action': None, 'reward': 1.8928938019787227, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.89)
9% of time remaining to reach destination.
/-------------------
| Step 32 Results
\-------------------
Environment.step(): t = 32
Environment.act() [POST]: location: (1, 7), heading: (-1, 0), action: None, reward: 0.310629600774
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 3, 't': 32, 'action': None, 'reward': 0.3106296007741236, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.31)
6% of time remaining to reach destination.
/-------------------
| Step 33 Results
\-------------------
Environment.step(): t = 33
Environment.act() [POST]: location: (1, 7), heading: (-1, 0), action: None, reward: -4.80723220096
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 2, 't': 33, 'action': None, 'reward': -4.807232200960117, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.81)
3% of time remaining to reach destination.
/-------------------
| Step 34 Results
\-------------------
Environment.step(): t = 34
Environment.act() [POST]: location: (1, 2), heading: (0, 1), action: left, reward: 1.99680011384
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 1, 't': 34, 'action': 'left', 'reward': 1.9968001138366818, 'waypoint': 'left'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.00)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 103
\-------------------------
Environment.reset(): Trial set up with start = (2, 6), destination = (5, 7), deadline = 20
Simulating trial. . .
epsilon = 0.3606; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: None, reward: -4.14999505928
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 20, 't': 0, 'action': None, 'reward': -4.149995059275806, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.15)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 7), heading: (0, 1), action: forward, reward: 1.80495631877
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 19, 't': 1, 'action': 'forward', 'reward': 1.804956318766905, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove forward instead of left. (rewarded 1.80)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 7), heading: (1, 0), action: left, reward: 1.54406981146
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 18, 't': 2, 'action': 'left', 'reward': 1.5440698114582612, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.54)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 7), heading: (1, 0), action: None, reward: -4.02190419179
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 17, 't': 3, 'action': None, 'reward': -4.021904191794821, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.02)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (4, 7), heading: (1, 0), action: forward, reward: 1.21205254937
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 1.212052549368296, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.21)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (4, 7), heading: (1, 0), action: None, reward: 1.1530688338
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 15, 't': 5, 'action': None, 'reward': 1.153068833801621, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.15)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 7), heading: (1, 0), action: None, reward: 1.16452980588
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 14, 't': 6, 'action': None, 'reward': 1.1645298058751548, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.16)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (4, 6), heading: (0, -1), action: left, reward: 1.33146471238
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 13, 't': 7, 'action': 'left', 'reward': 1.3314647123847534, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove left instead of forward. (rewarded 1.33)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: right, reward: 2.56112637701
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 12, 't': 8, 'action': 'right', 'reward': 2.561126377010359, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 2.56)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 7), heading: (0, 1), action: right, reward: 1.81103378064
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'right', 'left'), 'deadline': 11, 't': 9, 'action': 'right', 'reward': 1.8110337806437995, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'right', 'left')
Agent followed the waypoint right. (rewarded 1.81)
50% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 104
\-------------------------
Environment.reset(): Trial set up with start = (7, 3), destination = (2, 2), deadline = 20
Simulating trial. . .
epsilon = 0.3570; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (7, 3), heading: (0, -1), action: left, reward: -10.4614529834
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 20, 't': 0, 'action': 'left', 'reward': -10.461452983413999, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent attempted driving left through a red light. (rewarded -10.46)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (7, 3), heading: (0, -1), action: None, reward: 0.273075995223
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 19, 't': 1, 'action': None, 'reward': 0.2730759952233569, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.27)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 3), heading: (0, -1), action: None, reward: 0.17633442309
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 18, 't': 2, 'action': None, 'reward': 0.17633442308986502, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.18)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: right, reward: 1.09137634145
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 17, 't': 3, 'action': 'right', 'reward': 1.091376341452375, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.09)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: None, reward: -4.97347943741
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 16, 't': 4, 'action': None, 'reward': -4.973479437414666, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -4.97)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: None, reward: 1.56691425226
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', 'forward'), 'deadline': 15, 't': 5, 'action': None, 'reward': 1.5669142522594814, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', 'forward')
Agent properly idled at a red light. (rewarded 1.57)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: left, reward: -40.931628054
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': None}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 14, 't': 6, 'action': 'left', 'reward': -40.93162805399119, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -40.93)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: None, reward: 1.66154141242
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 13, 't': 7, 'action': None, 'reward': 1.6615414124170722, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.66)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: None, reward: 2.16204086057
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 12, 't': 8, 'action': None, 'reward': 2.1620408605735104, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.16)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: forward, reward: 0.944529130372
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 11, 't': 9, 'action': 'forward', 'reward': 0.9445291303718548, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 0.94)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: None, reward: 1.08612859607
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 10, 't': 10, 'action': None, 'reward': 1.086128596067852, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 1.09)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: None, reward: -5.83348378527
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 9, 't': 11, 'action': None, 'reward': -5.833483785274454, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.83)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: forward, reward: 2.54052210955
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'forward'), 'deadline': 8, 't': 12, 'action': 'forward', 'reward': 2.5405221095463117, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'forward')
Agent followed the waypoint forward. (rewarded 2.54)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: -5.65510691885
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 7, 't': 13, 'action': None, 'reward': -5.655106918851258, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -5.66)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (2, 2), heading: (0, -1), action: left, reward: 1.11324486707
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 6, 't': 14, 'action': 'left', 'reward': 1.1132448670658972, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 1.11)
25% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 105
\-------------------------
Environment.reset(): Trial set up with start = (1, 6), destination = (4, 2), deadline = 25
Simulating trial. . .
epsilon = 0.3535; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 6), heading: (0, -1), action: None, reward: 1.88071359475
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', 'forward'), 'deadline': 25, 't': 0, 'action': None, 'reward': 1.8807135947486977, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 1.88)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 6), heading: (0, -1), action: None, reward: 1.96081874578
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', 'forward'), 'deadline': 24, 't': 1, 'action': None, 'reward': 1.9608187457766135, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 1.96)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: right, reward: 1.17424427171
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 23, 't': 2, 'action': 'right', 'reward': 1.1742442717076536, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 1.17)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 7), heading: (0, 1), action: right, reward: 0.958846339127
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', 'left'), 'deadline': 22, 't': 3, 'action': 'right', 'reward': 0.95884633912744, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', 'left')
Agent drove right instead of forward. (rewarded 0.96)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 2), heading: (0, 1), action: forward, reward: 1.38722869531
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', 'forward'), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 1.387228695306518, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', 'forward')
Agent drove forward instead of left. (rewarded 1.39)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 2), heading: (-1, 0), action: right, reward: 1.80627751514
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', 'forward'), 'deadline': 20, 't': 5, 'action': 'right', 'reward': 1.8062775151355854, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', 'forward')
Agent drove right instead of left. (rewarded 1.81)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 7), heading: (0, -1), action: right, reward: 2.34553899205
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 19, 't': 6, 'action': 'right', 'reward': 2.3455389920457135, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.35)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 7), heading: (1, 0), action: right, reward: 1.39017942368
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 18, 't': 7, 'action': 'right', 'reward': 1.390179423675822, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent followed the waypoint right. (rewarded 1.39)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 7), heading: (1, 0), action: forward, reward: 1.3121508268
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 17, 't': 8, 'action': 'forward', 'reward': 1.3121508268021427, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent followed the waypoint forward. (rewarded 1.31)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (4, 7), heading: (1, 0), action: forward, reward: 1.95453548538
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 16, 't': 9, 'action': 'forward', 'reward': 1.9545354853833818, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.95)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 2), heading: (0, 1), action: right, reward: 1.3470822447
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 15, 't': 10, 'action': 'right', 'reward': 1.3470822447008812, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent followed the waypoint right. (rewarded 1.35)
56% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 106
\-------------------------
Environment.reset(): Trial set up with start = (2, 6), destination = (6, 7), deadline = 25
Simulating trial. . .
epsilon = 0.3499; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: None, reward: -5.65855397682
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', None, 'right'), 'deadline': 25, 't': 0, 'action': None, 'reward': -5.658553976815821, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'right')
Agent idled at a green light with no oncoming traffic. (rewarded -5.66)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 5), heading: (0, -1), action: left, reward: 1.83632219966
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 24, 't': 1, 'action': 'left', 'reward': 1.8363221996608867, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent drove left instead of right. (rewarded 1.84)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: left, reward: 2.52435030738
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 23, 't': 2, 'action': 'left', 'reward': 2.5243503073766194, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 2.52)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: None, reward: 2.96234380152
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 22, 't': 3, 'action': None, 'reward': 2.962343801516856, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.96)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: None, reward: 1.67340675967
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'right'), 'deadline': 21, 't': 4, 'action': None, 'reward': 1.6734067596745663, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 1.67)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: forward, reward: 2.61742192157
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 20, 't': 5, 'action': 'forward', 'reward': 2.6174219215733032, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 2.62)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (8, 4), heading: (0, -1), action: right, reward: 0.857717105575
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 19, 't': 6, 'action': 'right', 'reward': 0.8577171055752709, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent drove right instead of forward. (rewarded 0.86)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (7, 4), heading: (-1, 0), action: left, reward: 2.05810851862
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 18, 't': 7, 'action': 'left', 'reward': 2.058108518615109, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.06)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (6, 4), heading: (-1, 0), action: forward, reward: 1.06481249112
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 17, 't': 8, 'action': 'forward', 'reward': 1.0648124911190353, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent followed the waypoint forward. (rewarded 1.06)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (5, 4), heading: (-1, 0), action: forward, reward: -0.115727501954
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 16, 't': 9, 'action': 'forward', 'reward': -0.1157275019541899, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent drove forward instead of right. (rewarded -0.12)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (5, 3), heading: (0, -1), action: right, reward: 2.24804060121
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 15, 't': 10, 'action': 'right', 'reward': 2.2480406012122685, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 2.25)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (5, 3), heading: (0, -1), action: None, reward: -0.0693101366458
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'forward'), 'deadline': 14, 't': 11, 'action': None, 'reward': -0.06931013664583563, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded -0.07)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (6, 3), heading: (1, 0), action: right, reward: 2.37142015032
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 13, 't': 12, 'action': 'right', 'reward': 2.371420150315113, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 2.37)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (6, 3), heading: (1, 0), action: left, reward: -10.3387504479
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 12, 't': 13, 'action': 'left', 'reward': -10.33875044790841, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent attempted driving left through a red light. (rewarded -10.34)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (6, 3), heading: (1, 0), action: None, reward: 2.18854343363
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', 'right'), 'deadline': 11, 't': 14, 'action': None, 'reward': 2.188543433627938, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'right')
Agent properly idled at a red light. (rewarded 2.19)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (6, 4), heading: (0, 1), action: right, reward: -0.0224830693152
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 10, 't': 15, 'action': 'right', 'reward': -0.02248306931524735, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded -0.02)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (5, 4), heading: (-1, 0), action: right, reward: 1.19190234542
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 9, 't': 16, 'action': 'right', 'reward': 1.1919023454155542, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.19)
32% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (5, 3), heading: (0, -1), action: right, reward: 0.930894317327
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'right', 'left'), 'deadline': 8, 't': 17, 'action': 'right', 'reward': 0.930894317326789, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'right', 'left')
Agent followed the waypoint right. (rewarded 0.93)
28% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (4, 3), heading: (-1, 0), action: left, reward: 1.33427975784
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', 'forward'), 'deadline': 7, 't': 18, 'action': 'left', 'reward': 1.3342797578407606, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', 'forward')
Agent drove left instead of right. (rewarded 1.33)
24% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (4, 2), heading: (0, -1), action: right, reward: 2.22135243647
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 6, 't': 19, 'action': 'right', 'reward': 2.2213524364677273, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 2.22)
20% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (5, 2), heading: (1, 0), action: right, reward: 1.5119238051
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 5, 't': 20, 'action': 'right', 'reward': 1.511923805099686, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.51)
16% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (5, 2), heading: (1, 0), action: None, reward: 0.812067816644
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 4, 't': 21, 'action': None, 'reward': 0.8120678166444091, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.81)
12% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (5, 7), heading: (0, -1), action: left, reward: 1.03328421559
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 3, 't': 22, 'action': 'left', 'reward': 1.0332842155900128, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove left instead of forward. (rewarded 1.03)
8% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 7), heading: (1, 0), action: right, reward: 1.94430663689
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 2, 't': 23, 'action': 'right', 'reward': 1.944306636886872, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent followed the waypoint right. (rewarded 1.94)
4% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 107
\-------------------------
Environment.reset(): Trial set up with start = (6, 3), destination = (1, 7), deadline = 25
Simulating trial. . .
epsilon = 0.3465; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (7, 3), heading: (1, 0), action: forward, reward: 2.75397846978
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 25, 't': 0, 'action': 'forward', 'reward': 2.7539784697783825, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.75)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (7, 3), heading: (1, 0), action: forward, reward: -9.2967927702
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 24, 't': 1, 'action': 'forward', 'reward': -9.296792770195879, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -9.30)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 3), heading: (1, 0), action: None, reward: 2.47306546523
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 23, 't': 2, 'action': None, 'reward': 2.4730654652256705, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.47)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: forward, reward: 2.95860180236
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': 2.958601802357056, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.96)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 4), heading: (0, 1), action: right, reward: 1.40907169118
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', 'left'), 'deadline': 21, 't': 4, 'action': 'right', 'reward': 1.4090716911800185, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', 'left')
Agent drove right instead of forward. (rewarded 1.41)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 4), heading: (1, 0), action: left, reward: 2.54425298663
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 20, 't': 5, 'action': 'left', 'reward': 2.5442529866291403, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.54)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 4), heading: (1, 0), action: None, reward: 1.35829931998
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 19, 't': 6, 'action': None, 'reward': 1.3582993199791256, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.36)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 3), heading: (0, -1), action: left, reward: 2.56837330843
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 18, 't': 7, 'action': 'left', 'reward': 2.568373308431368, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.57)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (1, 2), heading: (0, -1), action: forward, reward: 1.82275736724
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 17, 't': 8, 'action': 'forward', 'reward': 1.822757367241563, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.82)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 7), heading: (0, -1), action: forward, reward: 1.20237528115
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 16, 't': 9, 'action': 'forward', 'reward': 1.2023752811467672, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.20)
60% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 108
\-------------------------
Environment.reset(): Trial set up with start = (3, 7), destination = (5, 5), deadline = 20
Simulating trial. . .
epsilon = 0.3430; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 6), heading: (0, -1), action: right, reward: 1.62510784084
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.6251078408428632, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.63)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (4, 6), heading: (1, 0), action: right, reward: 1.86129189498
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 1.8612918949807038, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 1.86)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (4, 6), heading: (1, 0), action: forward, reward: -9.86661661605
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': 'forward', 'reward': -9.866616616051214, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent attempted driving forward through a red light. (rewarded -9.87)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (4, 7), heading: (0, 1), action: right, reward: 0.767938717488
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 17, 't': 3, 'action': 'right', 'reward': 0.7679387174880697, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent drove right instead of forward. (rewarded 0.77)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (4, 2), heading: (0, 1), action: forward, reward: 1.20420746006
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 1.2042074600637411, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove forward instead of left. (rewarded 1.20)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (4, 2), heading: (0, 1), action: None, reward: 2.81294543034
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 15, 't': 5, 'action': None, 'reward': 2.8129454303375514, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.81)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 2), heading: (0, 1), action: None, reward: 2.37882669442
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 14, 't': 6, 'action': None, 'reward': 2.3788266944191925, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.38)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (5, 2), heading: (1, 0), action: left, reward: 1.78956742497
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 13, 't': 7, 'action': 'left', 'reward': 1.789567424970632, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.79)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (5, 2), heading: (1, 0), action: forward, reward: -10.1647119777
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 12, 't': 8, 'action': 'forward', 'reward': -10.164711977704256, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -10.16)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (5, 2), heading: (1, 0), action: None, reward: 1.4009668305
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 11, 't': 9, 'action': None, 'reward': 1.4009668304994607, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.40)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (5, 7), heading: (0, -1), action: left, reward: 2.72625332268
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 10, 't': 10, 'action': 'left', 'reward': 2.726253322680666, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.73)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (5, 6), heading: (0, -1), action: forward, reward: 2.7038674515
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 9, 't': 11, 'action': 'forward', 'reward': 2.703867451500522, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.70)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (6, 6), heading: (1, 0), action: right, reward: 0.938158222934
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 8, 't': 12, 'action': 'right', 'reward': 0.9381582229340131, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent drove right instead of forward. (rewarded 0.94)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (6, 6), heading: (1, 0), action: None, reward: 1.66402739558
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 7, 't': 13, 'action': None, 'reward': 1.6640273955751275, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.66)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (6, 6), heading: (1, 0), action: left, reward: -39.6302627385
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 6, 't': 14, 'action': 'left', 'reward': -39.63026273853191, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -39.63)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (6, 6), heading: (1, 0), action: None, reward: 2.37282656834
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 5, 't': 15, 'action': None, 'reward': 2.3728265683365786, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.37)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (6, 5), heading: (0, -1), action: left, reward: 2.1800861627
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 4, 't': 16, 'action': 'left', 'reward': 2.1800861627039807, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.18)
15% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (6, 5), heading: (0, -1), action: None, reward: 0.994630135124
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 3, 't': 17, 'action': None, 'reward': 0.9946301351235218, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.99)
10% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (6, 5), heading: (0, -1), action: forward, reward: -9.36230057044
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 2, 't': 18, 'action': 'forward', 'reward': -9.362300570440736, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -9.36)
5% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 5), heading: (-1, 0), action: left, reward: 0.714908207978
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 1, 't': 19, 'action': 'left', 'reward': 0.7149082079776681, 'waypoint': 'left'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 0.71)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 109
\-------------------------
Environment.reset(): Trial set up with start = (8, 3), destination = (2, 6), deadline = 25
Simulating trial. . .
epsilon = 0.3396; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: forward, reward: 1.75905040933
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 25, 't': 0, 'action': 'forward', 'reward': 1.7590504093327863, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.76)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: None, reward: 2.94487392137
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 24, 't': 1, 'action': None, 'reward': 2.9448739213669786, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.94)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: None, reward: 2.13124959317
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 23, 't': 2, 'action': None, 'reward': 2.1312495931730666, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.13)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: forward, reward: 1.79370950941
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': 1.793709509405578, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.79)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: -5.67111680032
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 21, 't': 4, 'action': None, 'reward': -5.671116800320673, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.67)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 2), heading: (0, -1), action: left, reward: 1.53203019662
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 20, 't': 5, 'action': 'left', 'reward': 1.5320301966218557, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.53)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 2), heading: (0, -1), action: left, reward: -10.293604672
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 19, 't': 6, 'action': 'left', 'reward': -10.293604672045548, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent attempted driving left through a red light. (rewarded -10.29)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 7), heading: (0, -1), action: forward, reward: 1.94850982204
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'left'), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 1.9485098220428854, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'left')
Agent followed the waypoint forward. (rewarded 1.95)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (2, 6), heading: (0, -1), action: forward, reward: 1.28304257377
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 17, 't': 8, 'action': 'forward', 'reward': 1.2830425737733064, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.28)
64% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 110
\-------------------------
Environment.reset(): Trial set up with start = (4, 3), destination = (6, 7), deadline = 20
Simulating trial. . .
epsilon = 0.3362; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 3), heading: (-1, 0), action: right, reward: 0.0108952111748
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'left'), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 0.01089521117483594, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'left')
Agent drove right instead of left. (rewarded 0.01)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 2), heading: (0, -1), action: right, reward: 1.60692315443
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 1.6069231544312539, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent followed the waypoint right. (rewarded 1.61)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (4, 2), heading: (1, 0), action: right, reward: 1.47804568489
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 18, 't': 2, 'action': 'right', 'reward': 1.4780456848927308, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.48)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (4, 2), heading: (1, 0), action: left, reward: -19.1340402992
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 3, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 17, 't': 3, 'action': 'left', 'reward': -19.134040299171385, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent attempted driving left through traffic and cause a minor accident. (rewarded -19.13)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 2), heading: (1, 0), action: forward, reward: 1.30298617966
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 1.3029861796627853, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.30)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (6, 2), heading: (1, 0), action: forward, reward: 2.24811083707
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 2.2481108370705245, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.25)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 7), heading: (0, -1), action: left, reward: 1.76342914751
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 14, 't': 6, 'action': 'left', 'reward': 1.7634291475120139, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.76)
65% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 111
\-------------------------
Environment.reset(): Trial set up with start = (1, 5), destination = (8, 2), deadline = 20
Simulating trial. . .
epsilon = 0.3329; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: None, reward: 2.55747547559
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 20, 't': 0, 'action': None, 'reward': 2.557475475590773, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.56)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 4), heading: (0, -1), action: right, reward: 0.482651908028
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 0.482651908027547, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent drove right instead of forward. (rewarded 0.48)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 4), heading: (-1, 0), action: left, reward: 1.89994348882
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 18, 't': 2, 'action': 'left', 'reward': 1.899943488820478, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 1.90)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 3), heading: (0, -1), action: right, reward: 2.78551864296
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 17, 't': 3, 'action': 'right', 'reward': 2.785518642963314, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent followed the waypoint right. (rewarded 2.79)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (8, 2), heading: (0, -1), action: forward, reward: 2.03217904423
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 2.0321790442340584, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.03)
75% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 112
\-------------------------
Environment.reset(): Trial set up with start = (7, 7), destination = (3, 6), deadline = 25
Simulating trial. . .
epsilon = 0.3296; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: right, reward: 2.75460460978
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 2.7546046097779033, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent followed the waypoint right. (rewarded 2.75)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 2), heading: (0, 1), action: right, reward: 1.10107233158
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'right', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', 'right'), 'deadline': 24, 't': 1, 'action': 'right', 'reward': 1.1010723315798463, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', 'right')
Agent drove right instead of forward. (rewarded 1.10)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 2), heading: (0, 1), action: None, reward: 2.42472917118
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 23, 't': 2, 'action': None, 'reward': 2.4247291711780674, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.42)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 2), heading: (0, 1), action: None, reward: 1.82314839431
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 22, 't': 3, 'action': None, 'reward': 1.823148394310695, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.82)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: left, reward: 1.88500207539
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'right'), 'deadline': 21, 't': 4, 'action': 'left', 'reward': 1.8850020753901244, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'right')
Agent followed the waypoint left. (rewarded 1.89)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: None, reward: 1.63629118203
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 20, 't': 5, 'action': None, 'reward': 1.6362911820322283, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.64)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: None, reward: 2.09179797424
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 19, 't': 6, 'action': None, 'reward': 2.091797974242568, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.09)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 2), heading: (1, 0), action: forward, reward: 2.89237557116
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 2.8923755711613883, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.89)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 2), heading: (1, 0), action: forward, reward: 1.7784813691
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 17, 't': 8, 'action': 'forward', 'reward': 1.7784813690971397, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.78)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (3, 7), heading: (0, -1), action: left, reward: 2.43055718391
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 16, 't': 9, 'action': 'left', 'reward': 2.430557183912491, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.43)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (3, 7), heading: (0, -1), action: None, reward: 1.47374847517
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 15, 't': 10, 'action': None, 'reward': 1.473748475165189, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.47)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 6), heading: (0, -1), action: forward, reward: 1.48136416586
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 14, 't': 11, 'action': 'forward', 'reward': 1.4813641658550132, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.48)
52% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 113
\-------------------------
Environment.reset(): Trial set up with start = (6, 7), destination = (3, 3), deadline = 25
Simulating trial. . .
epsilon = 0.3263; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 7), heading: (0, 1), action: left, reward: -40.4734376505
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('right', 'red', 'left', 'forward'), 'deadline': 25, 't': 0, 'action': 'left', 'reward': -40.4734376505139, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', 'forward')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -40.47)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 7), heading: (0, 1), action: right, reward: -19.7002406149
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': 'forward'}, 'violation': 3, 'light': 'red', 'state': ('right', 'red', 'left', 'forward'), 'deadline': 24, 't': 1, 'action': 'right', 'reward': -19.70024061491556, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', 'forward')
Agent attempted driving right through traffic and cause a minor accident. (rewarded -19.70)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 7), heading: (-1, 0), action: right, reward: 2.30036061183
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 23, 't': 2, 'action': 'right', 'reward': 2.3003606118344955, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 2.30)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 7), heading: (-1, 0), action: None, reward: 1.43240067995
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 22, 't': 3, 'action': None, 'reward': 1.4324006799517166, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.43)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 7), heading: (-1, 0), action: None, reward: 1.82340762328
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 21, 't': 4, 'action': None, 'reward': 1.823407623280913, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.82)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 7), heading: (-1, 0), action: left, reward: -20.9489200454
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 3, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 20, 't': 5, 'action': 'left', 'reward': -20.948920045425634, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent attempted driving left through traffic and cause a minor accident. (rewarded -20.95)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 7), heading: (-1, 0), action: forward, reward: 2.58140663883
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 19, 't': 6, 'action': 'forward', 'reward': 2.5814066388346895, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.58)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 7), heading: (-1, 0), action: forward, reward: 1.63517693219
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 1.6351769321943772, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.64)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 2), heading: (0, 1), action: left, reward: 2.30990348172
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 17, 't': 8, 'action': 'left', 'reward': 2.3099034817165576, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.31)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: right, reward: 1.55093734466
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 16, 't': 9, 'action': 'right', 'reward': 1.5509373446552093, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent drove right instead of forward. (rewarded 1.55)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (2, 7), heading: (0, -1), action: right, reward: 0.805271601398
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 15, 't': 10, 'action': 'right', 'reward': 0.8052716013975493, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove right instead of left. (rewarded 0.81)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (3, 7), heading: (1, 0), action: right, reward: 1.7217856951
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 14, 't': 11, 'action': 'right', 'reward': 1.7217856951018136, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent followed the waypoint right. (rewarded 1.72)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (3, 2), heading: (0, 1), action: right, reward: 1.42069383869
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 13, 't': 12, 'action': 'right', 'reward': 1.42069383868816, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent followed the waypoint right. (rewarded 1.42)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (3, 2), heading: (0, 1), action: None, reward: -5.82210107077
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 12, 't': 13, 'action': None, 'reward': -5.822101070774321, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.82)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 3), heading: (0, 1), action: forward, reward: 0.847373406744
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 11, 't': 14, 'action': 'forward', 'reward': 0.8473734067439993, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 0.85)
40% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 114
\-------------------------
Environment.reset(): Trial set up with start = (3, 2), destination = (5, 4), deadline = 20
Simulating trial. . .
epsilon = 0.3230; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (4, 2), heading: (1, 0), action: left, reward: 1.24696502749
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 20, 't': 0, 'action': 'left', 'reward': 1.246965027487935, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.25)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 2), heading: (1, 0), action: forward, reward: 2.54207539936
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 19, 't': 1, 'action': 'forward', 'reward': 2.542075399355336, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.54)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: right, reward: 2.41490478278
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 18, 't': 2, 'action': 'right', 'reward': 2.4149047827805363, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent followed the waypoint right. (rewarded 2.41)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: None, reward: 2.0580399689
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 17, 't': 3, 'action': None, 'reward': 2.0580399689016646, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.06)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (4, 3), heading: (-1, 0), action: right, reward: 0.767347255557
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 16, 't': 4, 'action': 'right', 'reward': 0.767347255557017, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent drove right instead of forward. (rewarded 0.77)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (4, 2), heading: (0, -1), action: right, reward: 0.192179358199
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 15, 't': 5, 'action': 'right', 'reward': 0.1921793581993898, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 0.19)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (5, 2), heading: (1, 0), action: right, reward: 2.56993665416
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 14, 't': 6, 'action': 'right', 'reward': 2.569936654159844, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.57)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: right, reward: 1.37931905305
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 13, 't': 7, 'action': 'right', 'reward': 1.3793190530471389, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.38)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 4), heading: (0, 1), action: forward, reward: 2.81293726618
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 12, 't': 8, 'action': 'forward', 'reward': 2.812937266178824, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 2.81)
55% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 115
\-------------------------
Environment.reset(): Trial set up with start = (6, 7), destination = (1, 5), deadline = 25
Simulating trial. . .
epsilon = 0.3198; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 7), heading: (1, 0), action: None, reward: 2.24813495987
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'left'), 'deadline': 25, 't': 0, 'action': None, 'reward': 2.2481349598689238, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'left')
Agent properly idled at a red light. (rewarded 2.25)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 7), heading: (1, 0), action: None, reward: 1.51081601189
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'right'), 'deadline': 24, 't': 1, 'action': None, 'reward': 1.510816011894515, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'right')
Agent properly idled at a red light. (rewarded 1.51)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 7), heading: (1, 0), action: left, reward: -9.81046081574
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 23, 't': 2, 'action': 'left', 'reward': -9.810460815735665, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent attempted driving left through a red light. (rewarded -9.81)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 7), heading: (1, 0), action: forward, reward: 1.89823657718
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': 1.8982365771818135, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.90)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 6), heading: (0, -1), action: left, reward: 0.569965705365
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 21, 't': 4, 'action': 'left', 'reward': 0.5699657053645958, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent drove left instead of forward. (rewarded 0.57)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 6), heading: (0, -1), action: None, reward: -4.39030954995
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 20, 't': 5, 'action': None, 'reward': -4.390309549952592, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.39)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (7, 6), heading: (0, -1), action: None, reward: 1.80489678885
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 19, 't': 6, 'action': None, 'reward': 1.8048967888511287, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.80)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: right, reward: 1.29309614547
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 18, 't': 7, 'action': 'right', 'reward': 1.29309614547417, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 1.29)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: forward, reward: 1.66763480761
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 17, 't': 8, 'action': 'forward', 'reward': 1.6676348076145637, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.67)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: right, reward: 0.800551706324
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 16, 't': 9, 'action': 'right', 'reward': 0.800551706324333, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent drove right instead of left. (rewarded 0.80)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: left, reward: -9.19118795567
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 15, 't': 10, 'action': 'left', 'reward': -9.191187955671003, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -9.19)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: right, reward: 1.23947277466
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 14, 't': 11, 'action': 'right', 'reward': 1.23947277465714, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.24)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (8, 6), heading: (0, -1), action: right, reward: 2.69420841755
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 13, 't': 12, 'action': 'right', 'reward': 2.694208417546934, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 2.69)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: right, reward: 1.47978536803
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 12, 't': 13, 'action': 'right', 'reward': 1.4797853680265374, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 1.48)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: left, reward: -40.8253092282
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 11, 't': 14, 'action': 'left', 'reward': -40.82530922819604, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -40.83)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: None, reward: 2.28259377221
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', 'left'), 'deadline': 10, 't': 15, 'action': None, 'reward': 2.282593772212145, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', 'left')
Agent properly idled at a red light. (rewarded 2.28)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: right, reward: 1.46794825729
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'left'), 'deadline': 9, 't': 16, 'action': 'right', 'reward': 1.4679482572938678, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'left')
Agent drove right instead of left. (rewarded 1.47)
32% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: right, reward: 2.11378791773
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 8, 't': 17, 'action': 'right', 'reward': 2.113787917728604, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.11)
28% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: None, reward: 0.462004243636
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', 'forward'), 'deadline': 7, 't': 18, 'action': None, 'reward': 0.46200424363597503, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 0.46)
24% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: None, reward: -5.39874256926
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 6, 't': 19, 'action': None, 'reward': -5.398742569257244, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.40)
20% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (8, 6), heading: (0, -1), action: right, reward: 1.50358961361
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'right', 'forward'), 'deadline': 5, 't': 20, 'action': 'right', 'reward': 1.5035896136145568, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'right', 'forward')
Agent followed the waypoint right. (rewarded 1.50)
16% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: right, reward: 1.23244362704
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 4, 't': 21, 'action': 'right', 'reward': 1.232443627040436, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 1.23)
12% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 5), heading: (0, -1), action: left, reward: 0.517799435381
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 3, 't': 22, 'action': 'left', 'reward': 0.5177994353814921, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 0.52)
8% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 116
\-------------------------
Environment.reset(): Trial set up with start = (3, 4), destination = (7, 5), deadline = 25
Simulating trial. . .
epsilon = 0.3166; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (4, 4), heading: (1, 0), action: right, reward: 0.982974130834
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'left'), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 0.9829741308338689, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'left')
Agent drove right instead of left. (rewarded 0.98)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 4), heading: (1, 0), action: forward, reward: 2.46142237193
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 24, 't': 1, 'action': 'forward', 'reward': 2.461422371929573, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 2.46)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 4), heading: (1, 0), action: forward, reward: 1.62948229197
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 23, 't': 2, 'action': 'forward', 'reward': 1.6294822919720842, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.63)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (6, 4), heading: (1, 0), action: left, reward: -40.1211978874
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 22, 't': 3, 'action': 'left', 'reward': -40.12119788739673, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -40.12)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (6, 4), heading: (1, 0), action: None, reward: 1.66992639769
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 21, 't': 4, 'action': None, 'reward': 1.6699263976911924, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.67)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 4), heading: (1, 0), action: forward, reward: 1.27087109811
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 20, 't': 5, 'action': 'forward', 'reward': 1.2708710981144145, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.27)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (7, 5), heading: (0, 1), action: right, reward: 2.26607059371
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', 'forward'), 'deadline': 19, 't': 6, 'action': 'right', 'reward': 2.2660705937146375, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', 'forward')
Agent followed the waypoint right. (rewarded 2.27)
72% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 117
\-------------------------
Environment.reset(): Trial set up with start = (3, 6), destination = (8, 2), deadline = 25
Simulating trial. . .
epsilon = 0.3135; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 6), heading: (0, -1), action: None, reward: 1.28769913033
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 25, 't': 0, 'action': None, 'reward': 1.2876991303258922, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.29)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 6), heading: (0, -1), action: None, reward: 2.47095201726
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 24, 't': 1, 'action': None, 'reward': 2.4709520172626416, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.47)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 6), heading: (0, -1), action: None, reward: 1.00885589824
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.0088558982385019, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.01)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: left, reward: 1.22725384357
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 22, 't': 3, 'action': 'left', 'reward': 1.2272538435670364, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.23)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: None, reward: -5.73525590326
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 21, 't': 4, 'action': None, 'reward': -5.735255903257574, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -5.74)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: forward, reward: -40.3987543637
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', 'forward', 'forward'), 'deadline': 20, 't': 5, 'action': 'forward', 'reward': -40.39875436365869, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'forward')
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -40.40)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: None, reward: 2.59268040327
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 19, 't': 6, 'action': None, 'reward': 2.5926804032695223, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.59)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: forward, reward: -9.42522108651
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': -9.42522108651257, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent attempted driving forward through a red light. (rewarded -9.43)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: right, reward: -19.6688040028
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'right', 'left': 'forward'}, 'violation': 3, 'light': 'red', 'state': ('forward', 'red', 'forward', 'forward'), 'deadline': 17, 't': 8, 'action': 'right', 'reward': -19.66880400277871, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'forward')
Agent attempted driving right through traffic and cause a minor accident. (rewarded -19.67)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 5), heading: (0, -1), action: right, reward: 0.125719229132
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'right'), 'deadline': 16, 't': 9, 'action': 'right', 'reward': 0.12571922913241296, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'right')
Agent drove right instead of forward. (rewarded 0.13)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (2, 4), heading: (0, -1), action: forward, reward: 0.351392003703
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 15, 't': 10, 'action': 'forward', 'reward': 0.3513920037027686, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove forward instead of left. (rewarded 0.35)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: right, reward: 1.30093918696
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', 'left'), 'deadline': 14, 't': 11, 'action': 'right', 'reward': 1.3009391869648192, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', 'left')
Agent drove right instead of left. (rewarded 1.30)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (3, 5), heading: (0, 1), action: right, reward: -0.220944504878
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', 'forward'), 'deadline': 13, 't': 12, 'action': 'right', 'reward': -0.22094450487824213, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', 'forward')
Agent drove right instead of left. (rewarded -0.22)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: right, reward: 1.66827273185
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 12, 't': 13, 'action': 'right', 'reward': 1.6682727318523678, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 1.67)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (2, 4), heading: (0, -1), action: right, reward: 0.765695695443
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', 'forward'), 'deadline': 11, 't': 14, 'action': 'right', 'reward': 0.7656956954429296, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', 'forward')
Agent drove right instead of forward. (rewarded 0.77)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (2, 4), heading: (0, -1), action: None, reward: 2.45875422991
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 10, 't': 15, 'action': None, 'reward': 2.458754229911116, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.46)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: left, reward: 2.08426935186
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'right'), 'deadline': 9, 't': 16, 'action': 'left', 'reward': 2.0842693518612165, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'right')
Agent followed the waypoint left. (rewarded 2.08)
32% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: None, reward: 1.00943311145
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 8, 't': 17, 'action': None, 'reward': 1.0094331114531612, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.01)
28% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: None, reward: 2.16784470451
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 7, 't': 18, 'action': None, 'reward': 2.1678447045130014, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.17)
24% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (8, 4), heading: (-1, 0), action: forward, reward: 1.85904558052
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 6, 't': 19, 'action': 'forward', 'reward': 1.8590455805192398, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent followed the waypoint forward. (rewarded 1.86)
20% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (8, 3), heading: (0, -1), action: right, reward: 1.22790854853
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 5, 't': 20, 'action': 'right', 'reward': 1.2279085485293724, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.23)
16% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (8, 3), heading: (0, -1), action: None, reward: 0.385103555165
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 4, 't': 21, 'action': None, 'reward': 0.3851035551652775, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 0.39)
12% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (8, 3), heading: (0, -1), action: None, reward: 1.10472395649
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 3, 't': 22, 'action': None, 'reward': 1.1047239564945532, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.10)
8% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (8, 3), heading: (0, -1), action: None, reward: 0.681510837492
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 2, 't': 23, 'action': None, 'reward': 0.6815108374921364, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 0.68)
4% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (8, 3), heading: (0, -1), action: None, reward: 0.394433676378
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 1, 't': 24, 'action': None, 'reward': 0.39443367637847904, 'waypoint': 'forward'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 0.39)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 118
\-------------------------
Environment.reset(): Trial set up with start = (3, 4), destination = (8, 3), deadline = 20
Simulating trial. . .
epsilon = 0.3104; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 3), heading: (0, -1), action: right, reward: 0.408086393964
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 0.4080863939639734, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove right instead of forward. (rewarded 0.41)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 3), heading: (0, -1), action: forward, reward: -9.38197793165
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 19, 't': 1, 'action': 'forward', 'reward': -9.381977931650738, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent attempted driving forward through a red light. (rewarded -9.38)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 3), heading: (0, -1), action: left, reward: -9.86584990919
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': 'left', 'reward': -9.86584990919109, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent attempted driving left through a red light. (rewarded -9.87)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 3), heading: (0, -1), action: None, reward: -4.58282762894
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', 'left', 'left'), 'deadline': 17, 't': 3, 'action': None, 'reward': -4.582827628935356, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'left')
Agent idled at a green light with no oncoming traffic. (rewarded -4.58)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 3), heading: (-1, 0), action: left, reward: 2.27245739157
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 16, 't': 4, 'action': 'left', 'reward': 2.2724573915694064, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 2.27)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 4), heading: (0, 1), action: left, reward: 0.636657423673
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 15, 't': 5, 'action': 'left', 'reward': 0.6366574236728904, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent drove left instead of forward. (rewarded 0.64)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: right, reward: 2.07325734074
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 14, 't': 6, 'action': 'right', 'reward': 2.073257340744771, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent followed the waypoint right. (rewarded 2.07)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (8, 4), heading: (-1, 0), action: forward, reward: 2.60041464547
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 2.6004146454652872, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.60)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (8, 3), heading: (0, -1), action: right, reward: 1.83767804294
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 12, 't': 8, 'action': 'right', 'reward': 1.8376780429360335, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.84)
55% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 119
\-------------------------
Environment.reset(): Trial set up with start = (8, 3), destination = (4, 4), deadline = 25
Simulating trial. . .
epsilon = 0.3073; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 2), heading: (0, -1), action: forward, reward: 0.589327217016
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 25, 't': 0, 'action': 'forward', 'reward': 0.5893272170155868, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent drove forward instead of right. (rewarded 0.59)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 2), heading: (0, -1), action: None, reward: 1.70104837175
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 24, 't': 1, 'action': None, 'reward': 1.7010483717532394, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.70)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: right, reward: 1.22022773056
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 23, 't': 2, 'action': 'right', 'reward': 1.2202277305554246, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.22)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 2), heading: (1, 0), action: forward, reward: 2.94660937037
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': 2.946609370365689, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.95)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 2), heading: (1, 0), action: forward, reward: 1.82204943097
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 1.822049430970794, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.82)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (3, 3), heading: (0, 1), action: right, reward: 0.00775766067085
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 20, 't': 5, 'action': 'right', 'reward': 0.007757660670849931, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent drove right instead of forward. (rewarded 0.01)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 3), heading: (0, 1), action: None, reward: 1.58871402468
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 19, 't': 6, 'action': None, 'reward': 1.5887140246769205, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 1.59)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 3), heading: (-1, 0), action: right, reward: 1.58566863642
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 18, 't': 7, 'action': 'right', 'reward': 1.5856686364189028, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 1.59)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 4), heading: (0, 1), action: left, reward: 1.57133637231
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 17, 't': 8, 'action': 'left', 'reward': 1.5713363723140672, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.57)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 4), heading: (0, 1), action: None, reward: 0.903145714416
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'left'), 'deadline': 16, 't': 9, 'action': None, 'reward': 0.9031457144159136, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 0.90)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: right, reward: 0.462953761624
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'left'), 'deadline': 15, 't': 10, 'action': 'right', 'reward': 0.46295376162365354, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'left')
Agent drove right instead of left. (rewarded 0.46)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (1, 5), heading: (0, 1), action: left, reward: -0.118470217555
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 14, 't': 11, 'action': 'left', 'reward': -0.1184702175549347, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent drove left instead of right. (rewarded -0.12)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (1, 5), heading: (0, 1), action: None, reward: 2.41820320106
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 13, 't': 12, 'action': None, 'reward': 2.41820320106142, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.42)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: left, reward: 2.62804804398
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 12, 't': 13, 'action': 'left', 'reward': 2.628048043984065, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.63)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: forward, reward: 1.5265099401
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 11, 't': 14, 'action': 'forward', 'reward': 1.5265099400987645, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.53)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: None, reward: 0.781939379681
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 10, 't': 15, 'action': None, 'reward': 0.7819393796813487, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 0.78)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: forward, reward: 0.886857484196
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 9, 't': 16, 'action': 'forward', 'reward': 0.8868574841958001, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 0.89)
32% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 4), heading: (0, -1), action: left, reward: 2.29587516142
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 8, 't': 17, 'action': 'left', 'reward': 2.295875161415435, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.30)
28% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 120
\-------------------------
Environment.reset(): Trial set up with start = (6, 4), destination = (1, 5), deadline = 20
Simulating trial. . .
epsilon = 0.3042; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 4), heading: (-1, 0), action: None, reward: 2.54339608301
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 20, 't': 0, 'action': None, 'reward': 2.543396083007532, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.54)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 3), heading: (0, -1), action: right, reward: 0.519405167094
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 0.5194051670938408, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent drove right instead of left. (rewarded 0.52)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 3), heading: (1, 0), action: right, reward: 2.90831647349
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': 'right', 'reward': 2.908316473493219, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 2.91)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 4), heading: (0, 1), action: right, reward: 1.89267412893
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 17, 't': 3, 'action': 'right', 'reward': 1.8926741289289888, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'right')
Agent drove right instead of forward. (rewarded 1.89)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 4), heading: (1, 0), action: left, reward: 1.54238813209
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 16, 't': 4, 'action': 'left', 'reward': 1.5423881320852753, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.54)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 4), heading: (1, 0), action: forward, reward: 2.27081309078
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 2.270813090782333, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.27)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 4), heading: (1, 0), action: None, reward: 0.731068389236
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 14, 't': 6, 'action': None, 'reward': 0.7310683892359497, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 0.73)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 3), heading: (0, -1), action: left, reward: 1.12526943155
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 13, 't': 7, 'action': 'left', 'reward': 1.1252694315539111, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent drove left instead of right. (rewarded 1.13)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: right, reward: 2.08536899102
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 12, 't': 8, 'action': 'right', 'reward': 2.0853689910199904, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.09)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 4), heading: (0, 1), action: right, reward: 1.44655103376
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'right', None), 'deadline': 11, 't': 9, 'action': 'right', 'reward': 1.4465510337606817, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', None)
Agent followed the waypoint right. (rewarded 1.45)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: right, reward: 1.94008553249
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 10, 't': 10, 'action': 'right', 'reward': 1.9400855324890074, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.94)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 5), heading: (0, 1), action: left, reward: 2.49079357869
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 9, 't': 11, 'action': 'left', 'reward': 2.490793578691272, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.49)
40% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 121
\-------------------------
Environment.reset(): Trial set up with start = (5, 6), destination = (3, 3), deadline = 25
Simulating trial. . .
epsilon = 0.3012; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 7), heading: (0, 1), action: right, reward: 2.8303460075
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', 'left'), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 2.8303460075032643, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', 'left')
Agent followed the waypoint right. (rewarded 2.83)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (4, 7), heading: (-1, 0), action: right, reward: 1.09475770818
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 24, 't': 1, 'action': 'right', 'reward': 1.0947577081809514, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 1.09)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (4, 7), heading: (-1, 0), action: left, reward: -40.6681162738
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 23, 't': 2, 'action': 'left', 'reward': -40.66811627384732, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -40.67)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (4, 7), heading: (-1, 0), action: None, reward: 1.68337895686
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 22, 't': 3, 'action': None, 'reward': 1.6833789568596311, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.68)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (4, 7), heading: (-1, 0), action: left, reward: -20.2757422651
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'left'}, 'violation': 3, 'light': 'green', 'state': ('forward', 'green', 'forward', 'left'), 'deadline': 21, 't': 4, 'action': 'left', 'reward': -20.27574226507227, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'left')
Agent attempted driving left through traffic and cause a minor accident. (rewarded -20.28)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (3, 7), heading: (-1, 0), action: forward, reward: 1.80043489928
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 20, 't': 5, 'action': 'forward', 'reward': 1.8004348992843695, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.80)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 6), heading: (0, -1), action: right, reward: 0.466045206235
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 19, 't': 6, 'action': 'right', 'reward': 0.46604520623533563, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent drove right instead of left. (rewarded 0.47)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (4, 6), heading: (1, 0), action: right, reward: 1.48545596646
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'right', None), 'deadline': 18, 't': 7, 'action': 'right', 'reward': 1.4854559664631435, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', None)
Agent followed the waypoint right. (rewarded 1.49)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (4, 7), heading: (0, 1), action: right, reward: 1.04407736658
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'right'), 'deadline': 17, 't': 8, 'action': 'right', 'reward': 1.0440773665811902, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'right')
Agent followed the waypoint right. (rewarded 1.04)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (4, 2), heading: (0, 1), action: forward, reward: 0.622643725297
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 16, 't': 9, 'action': 'forward', 'reward': 0.622643725297104, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent drove forward instead of right. (rewarded 0.62)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (3, 2), heading: (-1, 0), action: right, reward: 2.6667861893
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 15, 't': 10, 'action': 'right', 'reward': 2.6667861892967215, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent followed the waypoint right. (rewarded 2.67)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (3, 2), heading: (-1, 0), action: None, reward: 1.75108158179
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 14, 't': 11, 'action': None, 'reward': 1.7510815817879648, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.75)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (3, 2), heading: (-1, 0), action: None, reward: 0.809536357512
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 13, 't': 12, 'action': None, 'reward': 0.8095363575124526, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 0.81)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (3, 2), heading: (-1, 0), action: None, reward: -5.77523626784
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', 'left', 'left'), 'deadline': 12, 't': 13, 'action': None, 'reward': -5.775236267838156, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'left')
Agent idled at a green light with no oncoming traffic. (rewarded -5.78)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 3), heading: (0, 1), action: left, reward: 1.49580820099
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 11, 't': 14, 'action': 'left', 'reward': 1.4958082009933622, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 1.50)
40% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 122
\-------------------------
Environment.reset(): Trial set up with start = (7, 7), destination = (3, 6), deadline = 25
Simulating trial. . .
epsilon = 0.2982; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: forward, reward: 1.49736997444
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 25, 't': 0, 'action': 'forward', 'reward': 1.4973699744352122, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent followed the waypoint forward. (rewarded 1.50)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: None, reward: 2.05755828964
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'right'), 'deadline': 24, 't': 1, 'action': None, 'reward': 2.0575582896375497, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 2.06)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: None, reward: 1.71030722369
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.710307223692275, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.71)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: None, reward: 1.54961353185
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 22, 't': 3, 'action': None, 'reward': 1.5496135318521393, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.55)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: forward, reward: 2.32255923997
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 2.322559239967923, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.32)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 2), heading: (0, 1), action: right, reward: 0.668490870778
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 20, 't': 5, 'action': 'right', 'reward': 0.6684908707782221, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove right instead of forward. (rewarded 0.67)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 3), heading: (0, 1), action: forward, reward: 1.73216798069
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 19, 't': 6, 'action': 'forward', 'reward': 1.7321679806870303, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove forward instead of left. (rewarded 1.73)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: left, reward: 2.61950124157
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 18, 't': 7, 'action': 'left', 'reward': 2.6195012415744516, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.62)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 2.57973610198
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 17, 't': 8, 'action': None, 'reward': 2.5797361019763168, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.58)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 1.90781119279
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 16, 't': 9, 'action': None, 'reward': 1.9078111927883399, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.91)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 2.64022538757
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'right'), 'deadline': 15, 't': 10, 'action': None, 'reward': 2.6402253875658124, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 2.64)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: forward, reward: 1.7262501048
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 14, 't': 11, 'action': 'forward', 'reward': 1.7262501048015577, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.73)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: None, reward: 2.63462086724
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 13, 't': 12, 'action': None, 'reward': 2.6346208672394305, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.63)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (4, 3), heading: (1, 0), action: forward, reward: 0.236946420496
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 12, 't': 13, 'action': 'forward', 'reward': 0.23694642049620174, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove forward instead of left. (rewarded 0.24)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (4, 2), heading: (0, -1), action: left, reward: 1.42032980534
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 11, 't': 14, 'action': 'left', 'reward': 1.420329805344917, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 1.42)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (4, 7), heading: (0, -1), action: forward, reward: 0.254969445943
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', 'right'), 'deadline': 10, 't': 15, 'action': 'forward', 'reward': 0.254969445942572, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', 'right')
Agent drove forward instead of left. (rewarded 0.25)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (4, 7), heading: (0, -1), action: right, reward: -20.7379354127
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 3, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 9, 't': 16, 'action': 'right', 'reward': -20.737935412685864, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent attempted driving right through traffic and cause a minor accident. (rewarded -20.74)
32% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (4, 7), heading: (0, -1), action: None, reward: 2.19187137251
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 8, 't': 17, 'action': None, 'reward': 2.1918713725067818, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.19)
28% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (4, 7), heading: (0, -1), action: None, reward: 0.588675666639
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 7, 't': 18, 'action': None, 'reward': 0.588675666638522, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.59)
24% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (3, 7), heading: (-1, 0), action: left, reward: 1.75654796258
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 6, 't': 19, 'action': 'left', 'reward': 1.756547962582323, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.76)
20% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 6), heading: (0, -1), action: right, reward: 0.581291120603
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 5, 't': 20, 'action': 'right', 'reward': 0.5812911206030531, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 0.58)
16% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 123
\-------------------------
Environment.reset(): Trial set up with start = (1, 6), destination = (3, 4), deadline = 20
Simulating trial. . .
epsilon = 0.2952; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: right, reward: 1.26890185837
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.268901858365756, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent drove right instead of forward. (rewarded 1.27)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: None, reward: 2.25730641021
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', 'left'), 'deadline': 19, 't': 1, 'action': None, 'reward': 2.2573064102079794, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'left')
Agent properly idled at a red light. (rewarded 2.26)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: None, reward: 1.57893355555
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.5789335555536546, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.58)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (1, 2), heading: (0, 1), action: forward, reward: 0.644166758161
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', 'right'), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 0.6441667581611631, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', 'right')
Agent drove forward instead of left. (rewarded 0.64)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 2), heading: (0, 1), action: None, reward: 1.78175986075
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 16, 't': 4, 'action': None, 'reward': 1.7817598607547052, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.78)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 2), heading: (1, 0), action: left, reward: 2.01948068533
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 15, 't': 5, 'action': 'left', 'reward': 2.0194806853348286, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.02)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 2), heading: (1, 0), action: None, reward: 1.24362923476
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 14, 't': 6, 'action': None, 'reward': 1.243629234759707, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.24)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 2), heading: (1, 0), action: forward, reward: 1.51209690469
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 1.5120969046894406, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.51)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 3), heading: (0, 1), action: right, reward: 2.03623608011
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 12, 't': 8, 'action': 'right', 'reward': 2.036236080114732, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 2.04)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (3, 3), heading: (0, 1), action: None, reward: 1.88411259256
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 11, 't': 9, 'action': None, 'reward': 1.884112592562623, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.88)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (3, 3), heading: (0, 1), action: None, reward: 2.13451386403
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 10, 't': 10, 'action': None, 'reward': 2.1345138640273005, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.13)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (3, 3), heading: (0, 1), action: None, reward: -5.99629538638
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 9, 't': 11, 'action': None, 'reward': -5.99629538637561, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent idled at a green light with no oncoming traffic. (rewarded -6.00)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 4), heading: (0, 1), action: forward, reward: 1.31611748227
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 8, 't': 12, 'action': 'forward', 'reward': 1.3161174822680253, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.32)
35% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 124
\-------------------------
Environment.reset(): Trial set up with start = (1, 5), destination = (4, 2), deadline = 30
Simulating trial. . .
epsilon = 0.2923; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 5), heading: (1, 0), action: forward, reward: -9.25114935155
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 30, 't': 0, 'action': 'forward', 'reward': -9.251149351549806, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent attempted driving forward through a red light. (rewarded -9.25)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 5), heading: (1, 0), action: None, reward: 2.51612335215
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 29, 't': 1, 'action': None, 'reward': 2.51612335214606, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.52)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 5), heading: (1, 0), action: None, reward: 2.9426173452
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 28, 't': 2, 'action': None, 'reward': 2.94261734519632, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.94)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: forward, reward: 1.69082800171
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 27, 't': 3, 'action': 'forward', 'reward': 1.6908280017145756, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.69)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: forward, reward: -10.8402149156
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 26, 't': 4, 'action': 'forward', 'reward': -10.840214915637603, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -10.84)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: None, reward: 2.31633736723
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', 'forward'), 'deadline': 25, 't': 5, 'action': None, 'reward': 2.3163373672319008, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', 'forward')
Agent properly idled at a red light. (rewarded 2.32)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: None, reward: 2.41726486994
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 24, 't': 6, 'action': None, 'reward': 2.417264869938726, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.42)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: forward, reward: 2.42420415624
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 23, 't': 7, 'action': 'forward', 'reward': 2.4242041562412417, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.42)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: right, reward: -19.1497607823
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 3, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 22, 't': 8, 'action': 'right', 'reward': -19.149760782274218, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent attempted driving right through traffic and cause a minor accident. (rewarded -19.15)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: None, reward: 2.11488440015
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 21, 't': 9, 'action': None, 'reward': 2.11488440015054, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.11)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: forward, reward: 1.86971601949
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 20, 't': 10, 'action': 'forward', 'reward': 1.8697160194938394, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.87)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (4, 6), heading: (0, 1), action: right, reward: 2.02016158769
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 19, 't': 11, 'action': 'right', 'reward': 2.0201615876942878, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.02)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (4, 7), heading: (0, 1), action: forward, reward: 1.14152660654
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 18, 't': 12, 'action': 'forward', 'reward': 1.1415266065365401, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.14)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 2), heading: (0, 1), action: forward, reward: 1.12488544483
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 17, 't': 13, 'action': 'forward', 'reward': 1.1248854448252839, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.12)
53% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 125
\-------------------------
Environment.reset(): Trial set up with start = (7, 4), destination = (1, 6), deadline = 20
Simulating trial. . .
epsilon = 0.2894; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 4), heading: (1, 0), action: right, reward: 2.11239865668
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'right', None), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 2.1123986566797197, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'right', None)
Agent followed the waypoint right. (rewarded 2.11)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 4), heading: (1, 0), action: forward, reward: 1.81367196564
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 19, 't': 1, 'action': 'forward', 'reward': 1.8136719656367595, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.81)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 5), heading: (0, 1), action: right, reward: 2.26242678247
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 18, 't': 2, 'action': 'right', 'reward': 2.2624267824710955, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent followed the waypoint right. (rewarded 2.26)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (1, 5), heading: (0, 1), action: left, reward: -9.46533703342
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 17, 't': 3, 'action': 'left', 'reward': -9.465337033424724, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -9.47)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 5), heading: (0, 1), action: left, reward: -10.735900077
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 16, 't': 4, 'action': 'left', 'reward': -10.735900076982913, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -10.74)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 6), heading: (0, 1), action: forward, reward: 2.06569374074
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 2.0656937407371307, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.07)
70% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 126
\-------------------------
Environment.reset(): Trial set up with start = (4, 7), destination = (6, 4), deadline = 25
Simulating trial. . .
epsilon = 0.2865; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (4, 6), heading: (0, -1), action: right, reward: 1.56119046567
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', 'left'), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 1.561190465666254, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', 'left')
Agent drove right instead of left. (rewarded 1.56)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: right, reward: 1.56291521352
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 24, 't': 1, 'action': 'right', 'reward': 1.5629152135240685, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent followed the waypoint right. (rewarded 1.56)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: None, reward: 2.81579412987
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'right'), 'deadline': 23, 't': 2, 'action': None, 'reward': 2.815794129865334, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 2.82)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: None, reward: 1.18025392194
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 22, 't': 3, 'action': None, 'reward': 1.1802539219414026, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.18)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: None, reward: 1.15706258175
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 21, 't': 4, 'action': None, 'reward': 1.1570625817520757, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.16)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: left, reward: -40.4216551482
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 20, 't': 5, 'action': 'left', 'reward': -40.42165514816751, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -40.42)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (5, 5), heading: (0, -1), action: left, reward: 0.873066419839
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 19, 't': 6, 'action': 'left', 'reward': 0.8730664198388588, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent drove left instead of forward. (rewarded 0.87)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 5), heading: (1, 0), action: right, reward: 1.51541563726
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 18, 't': 7, 'action': 'right', 'reward': 1.5154156372565306, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 1.52)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (6, 5), heading: (1, 0), action: forward, reward: -10.0490719716
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'right'}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, 'right'), 'deadline': 17, 't': 8, 'action': 'forward', 'reward': -10.049071971633698, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'right')
Agent attempted driving forward through a red light. (rewarded -10.05)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 4), heading: (0, -1), action: left, reward: 2.75638092691
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 16, 't': 9, 'action': 'left', 'reward': 2.756380926914839, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.76)
60% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 127
\-------------------------
Environment.reset(): Trial set up with start = (6, 2), destination = (5, 5), deadline = 20
Simulating trial. . .
epsilon = 0.2837; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 2), heading: (1, 0), action: forward, reward: -9.86915816082
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': 'right'}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'left', 'right'), 'deadline': 20, 't': 0, 'action': 'forward', 'reward': -9.869158160819676, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'right')
Agent attempted driving forward through a red light. (rewarded -9.87)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 3), heading: (0, 1), action: right, reward: 1.08715374503
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 1.0871537450293176, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent drove right instead of left. (rewarded 1.09)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 3), heading: (-1, 0), action: right, reward: 2.77392441221
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 18, 't': 2, 'action': 'right', 'reward': 2.7739244122130877, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.77)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 4), heading: (0, 1), action: left, reward: 1.8587623749
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 17, 't': 3, 'action': 'left', 'reward': 1.8587623748970121, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.86)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 4), heading: (0, 1), action: None, reward: -4.69738564671
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 16, 't': 4, 'action': None, 'reward': -4.697385646714536, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -4.70)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 4), heading: (0, 1), action: None, reward: -4.71249047611
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 15, 't': 5, 'action': None, 'reward': -4.71249047611022, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -4.71)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (5, 4), heading: (0, 1), action: left, reward: -39.0094538235
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 14, 't': 6, 'action': 'left', 'reward': -39.00945382352952, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -39.01)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (5, 4), heading: (0, 1), action: None, reward: 2.30998287269
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 13, 't': 7, 'action': None, 'reward': 2.309982872693258, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.31)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (5, 4), heading: (0, 1), action: None, reward: 2.26028724357
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 12, 't': 8, 'action': None, 'reward': 2.2602872435673405, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 2.26)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (5, 4), heading: (0, 1), action: None, reward: 2.29444824698
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 11, 't': 9, 'action': None, 'reward': 2.2944482469781535, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.29)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (5, 4), heading: (0, 1), action: None, reward: 1.76423315754
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 10, 't': 10, 'action': None, 'reward': 1.7642331575407895, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.76)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (6, 4), heading: (1, 0), action: left, reward: 0.146030791377
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 9, 't': 11, 'action': 'left', 'reward': 0.14603079137702313, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent drove left instead of forward. (rewarded 0.15)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (6, 5), heading: (0, 1), action: right, reward: 1.36065022495
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 8, 't': 12, 'action': 'right', 'reward': 1.3606502249519836, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.36)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (6, 5), heading: (0, 1), action: None, reward: -0.331992992454
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', 'forward'), 'deadline': 7, 't': 13, 'action': None, 'reward': -0.3319929924541195, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded -0.33)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 5), heading: (-1, 0), action: right, reward: 1.97366920842
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 6, 't': 14, 'action': 'right', 'reward': 1.9736692084239096, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent followed the waypoint right. (rewarded 1.97)
25% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 128
\-------------------------
Environment.reset(): Trial set up with start = (8, 4), destination = (5, 3), deadline = 20
Simulating trial. . .
epsilon = 0.2808; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (7, 4), heading: (-1, 0), action: right, reward: 2.58755486761
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'right'), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 2.5875548676056166, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'right')
Agent followed the waypoint right. (rewarded 2.59)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (7, 4), heading: (-1, 0), action: None, reward: 1.7424783206
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 19, 't': 1, 'action': None, 'reward': 1.7424783206003807, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.74)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 4), heading: (-1, 0), action: left, reward: -10.1792544927
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 18, 't': 2, 'action': 'left', 'reward': -10.179254492700725, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent attempted driving left through a red light. (rewarded -10.18)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 4), heading: (-1, 0), action: None, reward: 1.67042211461
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 1.6704221146141167, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.67)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (6, 4), heading: (-1, 0), action: forward, reward: 1.15114572453
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 1.151145724525829, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.15)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 4), heading: (-1, 0), action: forward, reward: 1.04862671783
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 1.0486267178345023, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.05)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 3), heading: (0, -1), action: right, reward: 2.04272908277
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 14, 't': 6, 'action': 'right', 'reward': 2.0427290827677123, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.04)
65% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 129
\-------------------------
Environment.reset(): Trial set up with start = (7, 6), destination = (1, 3), deadline = 25
Simulating trial. . .
epsilon = 0.2780; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: right, reward: 1.69575107129
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 1.6957510712859816, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.70)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: None, reward: 2.58502777283
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 24, 't': 1, 'action': None, 'reward': 2.585027772829396, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.59)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 7), heading: (0, 1), action: right, reward: 1.01528780405
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 23, 't': 2, 'action': 'right', 'reward': 1.0152878040459286, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent drove right instead of forward. (rewarded 1.02)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: left, reward: 1.93459547445
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 22, 't': 3, 'action': 'left', 'reward': 1.9345954744493166, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 1.93)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 2), heading: (0, 1), action: right, reward: 1.58075056488
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 21, 't': 4, 'action': 'right', 'reward': 1.580750564881574, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.58)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 3), heading: (0, 1), action: forward, reward: 2.60084072263
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 20, 't': 5, 'action': 'forward', 'reward': 2.600840722633868, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.60)
76% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 130
\-------------------------
Environment.reset(): Trial set up with start = (2, 5), destination = (5, 3), deadline = 25
Simulating trial. . .
epsilon = 0.2753; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: forward, reward: -39.5128987249
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': 'right'}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', 'left', 'right'), 'deadline': 25, 't': 0, 'action': 'forward', 'reward': -39.51289872487786, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'right')
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -39.51)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: None, reward: 2.1545831489
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'right'), 'deadline': 24, 't': 1, 'action': None, 'reward': 2.154583148897925, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'right')
Agent properly idled at a red light. (rewarded 2.15)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: None, reward: 1.71950660371
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.719506603708897, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.72)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: None, reward: 2.43197529945
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 22, 't': 3, 'action': None, 'reward': 2.431975299451529, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.43)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: None, reward: -5.88213699766
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 21, 't': 4, 'action': None, 'reward': -5.882136997663836, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.88)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: right, reward: 0.908008667556
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 20, 't': 5, 'action': 'right', 'reward': 0.9080086675563702, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent drove right instead of forward. (rewarded 0.91)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: left, reward: 2.82853717352
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'forward'), 'deadline': 19, 't': 6, 'action': 'left', 'reward': 2.8285371735212275, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'forward')
Agent followed the waypoint left. (rewarded 2.83)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: None, reward: 1.25946582541
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 18, 't': 7, 'action': None, 'reward': 1.259465825407214, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.26)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: None, reward: 2.51861984177
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 17, 't': 8, 'action': None, 'reward': 2.5186198417736536, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.52)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (4, 6), heading: (1, 0), action: forward, reward: 2.20975546454
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 16, 't': 9, 'action': 'forward', 'reward': 2.209755464536369, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.21)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: forward, reward: 1.76672249375
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 15, 't': 10, 'action': 'forward', 'reward': 1.7667224937536565, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.77)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (5, 7), heading: (0, 1), action: right, reward: 0.969057318343
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 14, 't': 11, 'action': 'right', 'reward': 0.9690573183425786, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent followed the waypoint right. (rewarded 0.97)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (5, 7), heading: (0, 1), action: None, reward: 1.72151610881
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 13, 't': 12, 'action': None, 'reward': 1.7215161088117406, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.72)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (5, 2), heading: (0, 1), action: forward, reward: 1.44778325891
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 12, 't': 13, 'action': 'forward', 'reward': 1.4477832589142938, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.45)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: forward, reward: 2.28687742351
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'forward'), 'deadline': 11, 't': 14, 'action': 'forward', 'reward': 2.286877423511506, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'forward')
Agent followed the waypoint forward. (rewarded 2.29)
40% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 131
\-------------------------
Environment.reset(): Trial set up with start = (1, 6), destination = (6, 4), deadline = 25
Simulating trial. . .
epsilon = 0.2725; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: None, reward: 1.73332269604
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'forward'), 'deadline': 25, 't': 0, 'action': None, 'reward': 1.7333226960414996, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 1.73)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: None, reward: 2.94478609549
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 24, 't': 1, 'action': None, 'reward': 2.944786095494817, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.94)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: None, reward: 1.02667121159
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.02667121159239, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.03)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: None, reward: 0.9704564669
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 22, 't': 3, 'action': None, 'reward': 0.9704564668997588, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 0.97)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 5), heading: (0, -1), action: left, reward: 2.59223194691
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 21, 't': 4, 'action': 'left', 'reward': 2.5922319469078094, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.59)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 5), heading: (0, -1), action: None, reward: 2.88187469283
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 20, 't': 5, 'action': None, 'reward': 2.8818746928322287, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.88)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 5), heading: (0, -1), action: None, reward: 0.92361585303
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 19, 't': 6, 'action': None, 'reward': 0.9236158530300831, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.92)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: left, reward: 1.92239066296
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 18, 't': 7, 'action': 'left', 'reward': 1.9223906629605798, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.92)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (7, 5), heading: (-1, 0), action: forward, reward: 1.97011002427
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 17, 't': 8, 'action': 'forward', 'reward': 1.9701100242748928, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.97)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (7, 5), heading: (-1, 0), action: None, reward: 2.83568331803
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 16, 't': 9, 'action': None, 'reward': 2.835683318028268, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.84)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (7, 5), heading: (-1, 0), action: None, reward: 2.11237455607
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'forward'), 'deadline': 15, 't': 10, 'action': None, 'reward': 2.1123745560736613, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 2.11)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (6, 5), heading: (-1, 0), action: forward, reward: 1.81753329628
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 14, 't': 11, 'action': 'forward', 'reward': 1.8175332962845119, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.82)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 4), heading: (0, -1), action: right, reward: 2.44476381051
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 13, 't': 12, 'action': 'right', 'reward': 2.4447638105062905, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.44)
48% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 132
\-------------------------
Environment.reset(): Trial set up with start = (7, 5), destination = (3, 3), deadline = 30
Simulating trial. . .
epsilon = 0.2698; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 5), heading: (1, 0), action: right, reward: 1.52087528447
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'right', None), 'deadline': 30, 't': 0, 'action': 'right', 'reward': 1.5208752844714615, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', None)
Agent followed the waypoint right. (rewarded 1.52)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 5), heading: (1, 0), action: forward, reward: 2.86044700294
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 29, 't': 1, 'action': 'forward', 'reward': 2.860447002936259, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.86)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: forward, reward: 1.99023612738
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 28, 't': 2, 'action': 'forward', 'reward': 1.9902361273843536, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.99)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: None, reward: 2.65934730809
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 27, 't': 3, 'action': None, 'reward': 2.659347308094634, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.66)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: None, reward: -5.4255722646
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 26, 't': 4, 'action': None, 'reward': -5.425572264596213, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.43)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: forward, reward: 2.26735717329
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 25, 't': 5, 'action': 'forward', 'reward': 2.2673571732934956, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.27)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 6), heading: (0, 1), action: right, reward: 1.50232380725
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 24, 't': 6, 'action': 'right', 'reward': 1.502323807245602, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent drove right instead of left. (rewarded 1.50)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 6), heading: (0, 1), action: None, reward: 2.59897666294
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 23, 't': 7, 'action': None, 'reward': 2.5989766629400286, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.60)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 6), heading: (0, 1), action: None, reward: 1.7273591415
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 22, 't': 8, 'action': None, 'reward': 1.727359141503882, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.73)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (4, 6), heading: (1, 0), action: left, reward: 1.77020359212
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 21, 't': 9, 'action': 'left', 'reward': 1.7702035921184733, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent drove left instead of forward. (rewarded 1.77)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: forward, reward: 1.16541951603
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 20, 't': 10, 'action': 'forward', 'reward': 1.165419516027712, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent drove forward instead of right. (rewarded 1.17)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (5, 7), heading: (0, 1), action: right, reward: 2.12309963594
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 19, 't': 11, 'action': 'right', 'reward': 2.123099635940772, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.12)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (4, 7), heading: (-1, 0), action: right, reward: 2.08045456863
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'right'), 'deadline': 18, 't': 12, 'action': 'right', 'reward': 2.0804545686343867, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'right')
Agent followed the waypoint right. (rewarded 2.08)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (3, 7), heading: (-1, 0), action: forward, reward: 1.97573237638
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 17, 't': 13, 'action': 'forward', 'reward': 1.9757323763831813, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.98)
53% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (3, 7), heading: (-1, 0), action: None, reward: 1.8719011146
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 16, 't': 14, 'action': None, 'reward': 1.871901114603202, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.87)
50% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (3, 7), heading: (-1, 0), action: None, reward: -5.66634959716
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 15, 't': 15, 'action': None, 'reward': -5.6663495971603535, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.67)
47% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (3, 2), heading: (0, 1), action: left, reward: 2.11383210951
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 14, 't': 16, 'action': 'left', 'reward': 2.1138321095083255, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.11)
43% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 3), heading: (0, 1), action: forward, reward: 1.1081307133
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 13, 't': 17, 'action': 'forward', 'reward': 1.1081307132986906, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.11)
40% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 133
\-------------------------
Environment.reset(): Trial set up with start = (8, 3), destination = (3, 6), deadline = 30
Simulating trial. . .
epsilon = 0.2671; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 2), heading: (0, -1), action: right, reward: 1.42906715743
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', 'left'), 'deadline': 30, 't': 0, 'action': 'right', 'reward': 1.4290671574308904, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', 'left')
Agent followed the waypoint right. (rewarded 1.43)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 2), heading: (0, -1), action: left, reward: -10.3585590504
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 29, 't': 1, 'action': 'left', 'reward': -10.358559050439792, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent attempted driving left through a red light. (rewarded -10.36)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: right, reward: 2.70028079947
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 28, 't': 2, 'action': 'right', 'reward': 2.7002807994739237, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 2.70)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: None, reward: 1.07814730623
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 27, 't': 3, 'action': None, 'reward': 1.078147306234774, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.08)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: None, reward: 2.75178571989
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 26, 't': 4, 'action': None, 'reward': 2.751785719888036, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.75)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 2), heading: (1, 0), action: forward, reward: 2.49122818607
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 25, 't': 5, 'action': 'forward', 'reward': 2.49122818606609, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.49)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 2), heading: (1, 0), action: forward, reward: 1.98042489807
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 24, 't': 6, 'action': 'forward', 'reward': 1.9804248980650123, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.98)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 3), heading: (0, 1), action: right, reward: 1.3281550374
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 23, 't': 7, 'action': 'right', 'reward': 1.3281550373967725, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 1.33)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 3), heading: (-1, 0), action: right, reward: 1.90790726688
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'right'), 'deadline': 22, 't': 8, 'action': 'right', 'reward': 1.9079072668774533, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'right')
Agent followed the waypoint right. (rewarded 1.91)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 2), heading: (0, -1), action: right, reward: 1.12031035596
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 21, 't': 9, 'action': 'right', 'reward': 1.1203103559559033, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 1.12)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (2, 2), heading: (0, -1), action: forward, reward: -10.8434790497
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 20, 't': 10, 'action': 'forward', 'reward': -10.843479049743918, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent attempted driving forward through a red light. (rewarded -10.84)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (3, 2), heading: (1, 0), action: right, reward: 2.24074467715
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 19, 't': 11, 'action': 'right', 'reward': 2.240744677152496, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent followed the waypoint right. (rewarded 2.24)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (3, 2), heading: (1, 0), action: None, reward: 2.67443473603
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 18, 't': 12, 'action': None, 'reward': 2.67443473603264, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.67)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (3, 7), heading: (0, -1), action: left, reward: 0.812435116093
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 17, 't': 13, 'action': 'left', 'reward': 0.8124351160930332, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 0.81)
53% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (4, 7), heading: (1, 0), action: right, reward: -0.190209534994
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', 'left'), 'deadline': 16, 't': 14, 'action': 'right', 'reward': -0.19020953499385662, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', 'left')
Agent drove right instead of forward. (rewarded -0.19)
50% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (4, 7), heading: (1, 0), action: None, reward: 2.36042167224
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 15, 't': 15, 'action': None, 'reward': 2.3604216722423472, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.36)
47% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (4, 6), heading: (0, -1), action: left, reward: 1.11402623846
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 14, 't': 16, 'action': 'left', 'reward': 1.1140262384587576, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.11)
43% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 6), heading: (-1, 0), action: left, reward: 2.43785408579
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 13, 't': 17, 'action': 'left', 'reward': 2.437854085792745, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 2.44)
40% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 134
\-------------------------
Environment.reset(): Trial set up with start = (3, 3), destination = (7, 3), deadline = 20
Simulating trial. . .
epsilon = 0.2645; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (4, 3), heading: (1, 0), action: right, reward: 0.67659041278
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', 'left'), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 0.6765904127804376, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', 'left')
Agent drove right instead of left. (rewarded 0.68)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (4, 3), heading: (1, 0), action: None, reward: 1.30481876135
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 19, 't': 1, 'action': None, 'reward': 1.3048187613546498, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.30)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (4, 3), heading: (1, 0), action: None, reward: 2.92140987242
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.9214098724223128, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.92)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (4, 4), heading: (0, 1), action: right, reward: 1.5290495367
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 17, 't': 3, 'action': 'right', 'reward': 1.5290495366959824, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent drove right instead of forward. (rewarded 1.53)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 4), heading: (1, 0), action: left, reward: 2.33322941597
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 16, 't': 4, 'action': 'left', 'reward': 2.3332294159662226, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 2.33)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 4), heading: (1, 0), action: None, reward: 2.29887593821
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 15, 't': 5, 'action': None, 'reward': 2.298875938207612, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.30)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (6, 4), heading: (1, 0), action: forward, reward: 1.42277725732
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 14, 't': 6, 'action': 'forward', 'reward': 1.4227772573244903, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.42)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 4), heading: (1, 0), action: forward, reward: -9.14362064961
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': -9.14362064961145, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -9.14)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (6, 4), heading: (1, 0), action: forward, reward: -39.5853267677
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', 'forward', 'forward'), 'deadline': 12, 't': 8, 'action': 'forward', 'reward': -39.585326767729505, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'forward')
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -39.59)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (7, 4), heading: (1, 0), action: forward, reward: 2.40488380166
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 11, 't': 9, 'action': 'forward', 'reward': 2.4048838016635976, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.40)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (7, 4), heading: (1, 0), action: None, reward: 2.31051368123
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', 'forward'), 'deadline': 10, 't': 10, 'action': None, 'reward': 2.3105136812317264, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', 'forward')
Agent properly idled at a red light. (rewarded 2.31)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (7, 4), heading: (1, 0), action: None, reward: 1.57993807173
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 9, 't': 11, 'action': None, 'reward': 1.5799380717321423, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.58)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (7, 4), heading: (1, 0), action: None, reward: 2.20711780014
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 8, 't': 12, 'action': None, 'reward': 2.2071178001445197, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.21)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (8, 4), heading: (1, 0), action: forward, reward: -0.0499858442228
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', None), 'deadline': 7, 't': 13, 'action': 'forward', 'reward': -0.04998584422276531, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', None)
Agent drove forward instead of left. (rewarded -0.05)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (8, 4), heading: (1, 0), action: None, reward: 0.963501760791
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 6, 't': 14, 'action': None, 'reward': 0.9635017607906671, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.96)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (8, 4), heading: (1, 0), action: None, reward: 1.08323906763
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 5, 't': 15, 'action': None, 'reward': 1.0832390676285961, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.08)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (1, 4), heading: (1, 0), action: forward, reward: -0.177867446987
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 4, 't': 16, 'action': 'forward', 'reward': -0.1778674469870487, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove forward instead of left. (rewarded -0.18)
15% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (1, 4), heading: (1, 0), action: None, reward: -5.01364337451
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 3, 't': 17, 'action': None, 'reward': -5.013643374507391, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.01)
10% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (2, 4), heading: (1, 0), action: forward, reward: -0.700270438079
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', None), 'deadline': 2, 't': 18, 'action': 'forward', 'reward': -0.7002704380785578, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', None)
Agent drove forward instead of left. (rewarded -0.70)
5% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (2, 4), heading: (1, 0), action: None, reward: 0.985786156777
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 1, 't': 19, 'action': None, 'reward': 0.9857861567774877, 'waypoint': 'left'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('left', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 0.99)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 135
\-------------------------
Environment.reset(): Trial set up with start = (1, 6), destination = (6, 4), deadline = 25
Simulating trial. . .
epsilon = 0.2618; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 6), heading: (-1, 0), action: right, reward: 1.60465370158
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', 'left'), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 1.6046537015753084, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', 'left')
Agent followed the waypoint right. (rewarded 1.60)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 7), heading: (0, 1), action: left, reward: 0.194886927447
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 24, 't': 1, 'action': 'left', 'reward': 0.1948869274474061, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent drove left instead of forward. (rewarded 0.19)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: right, reward: 1.76209338004
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 23, 't': 2, 'action': 'right', 'reward': 1.7620933800426024, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent followed the waypoint right. (rewarded 1.76)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: None, reward: 1.78539436367
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 22, 't': 3, 'action': None, 'reward': 1.785394363672646, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.79)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: forward, reward: 2.29144223024
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'left'), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 2.291442230242751, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'left')
Agent followed the waypoint forward. (rewarded 2.29)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: left, reward: -9.09156042119
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 20, 't': 5, 'action': 'left', 'reward': -9.091560421185427, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -9.09)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: None, reward: 2.32146284407
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 19, 't': 6, 'action': None, 'reward': 2.3214628440658833, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.32)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 6), heading: (0, -1), action: right, reward: 1.50486629772
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 18, 't': 7, 'action': 'right', 'reward': 1.5048662977204523, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 1.50)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (6, 5), heading: (0, -1), action: forward, reward: 2.87872809711
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 17, 't': 8, 'action': 'forward', 'reward': 2.8787280971137195, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.88)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (6, 5), heading: (0, -1), action: None, reward: 1.17668941413
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 16, 't': 9, 'action': None, 'reward': 1.176689414133162, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.18)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (6, 5), heading: (0, -1), action: forward, reward: -39.320798543
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 15, 't': 10, 'action': 'forward', 'reward': -39.32079854302649, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -39.32)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 4), heading: (0, -1), action: forward, reward: 0.853759538875
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 14, 't': 11, 'action': 'forward', 'reward': 0.8537595388746118, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 0.85)
52% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 136
\-------------------------
Environment.reset(): Trial set up with start = (6, 2), destination = (4, 4), deadline = 20
Simulating trial. . .
epsilon = 0.2592; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 3), heading: (0, 1), action: right, reward: 1.22592687576
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', 'left'), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.2259268757573485, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', 'left')
Agent followed the waypoint right. (rewarded 1.23)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 3), heading: (-1, 0), action: right, reward: 1.56577656718
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 1.5657765671824158, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent followed the waypoint right. (rewarded 1.57)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 3), heading: (-1, 0), action: None, reward: 1.96962446399
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.9696244639922587, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 1.97)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 2), heading: (0, -1), action: right, reward: 0.924812288912
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 17, 't': 3, 'action': 'right', 'reward': 0.9248122889124276, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent drove right instead of forward. (rewarded 0.92)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 7), heading: (0, -1), action: forward, reward: 0.505969281695
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 0.5059692816946734, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', None)
Agent drove forward instead of left. (rewarded 0.51)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (4, 7), heading: (-1, 0), action: left, reward: 2.6230873822
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 15, 't': 5, 'action': 'left', 'reward': 2.623087382197756, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.62)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 2), heading: (0, 1), action: left, reward: 2.11646773777
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 14, 't': 6, 'action': 'left', 'reward': 2.1164677377731316, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 2.12)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (4, 2), heading: (0, 1), action: None, reward: 1.55330031444
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 13, 't': 7, 'action': None, 'reward': 1.5533003144379052, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.55)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (4, 2), heading: (0, 1), action: None, reward: 0.97863237875
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 12, 't': 8, 'action': None, 'reward': 0.9786323787503182, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 0.98)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (4, 3), heading: (0, 1), action: forward, reward: 2.01061199163
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 11, 't': 9, 'action': 'forward', 'reward': 2.01061199162944, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.01)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 4), heading: (0, 1), action: forward, reward: 2.27403690348
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 10, 't': 10, 'action': 'forward', 'reward': 2.274036903484637, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.27)
45% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 137
\-------------------------
Environment.reset(): Trial set up with start = (4, 7), destination = (6, 3), deadline = 20
Simulating trial. . .
epsilon = 0.2567; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (4, 7), heading: (0, 1), action: None, reward: 2.22672801692
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 20, 't': 0, 'action': None, 'reward': 2.226728016920944, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.23)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (4, 7), heading: (0, 1), action: None, reward: 1.99471448269
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'right'), 'deadline': 19, 't': 1, 'action': None, 'reward': 1.9947144826915675, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 1.99)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (4, 7), heading: (0, 1), action: None, reward: 1.7048485074
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.704848507402301, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.70)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (4, 7), heading: (0, 1), action: None, reward: 2.43637627708
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 17, 't': 3, 'action': None, 'reward': 2.436376277082328, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.44)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 7), heading: (1, 0), action: left, reward: 1.49254831022
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 16, 't': 4, 'action': 'left', 'reward': 1.4925483102249806, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.49)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 7), heading: (1, 0), action: None, reward: 2.46661950835
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 15, 't': 5, 'action': None, 'reward': 2.4666195083481504, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.47)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (5, 7), heading: (1, 0), action: None, reward: 1.66257873851
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'right'), 'deadline': 14, 't': 6, 'action': None, 'reward': 1.6625787385121047, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'right')
Agent properly idled at a red light. (rewarded 1.66)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (5, 7), heading: (1, 0), action: None, reward: 2.43092168228
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'forward'), 'deadline': 13, 't': 7, 'action': None, 'reward': 2.430921682279484, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 2.43)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (5, 7), heading: (1, 0), action: None, reward: 1.00935853397
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 12, 't': 8, 'action': None, 'reward': 1.0093585339672304, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.01)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (6, 7), heading: (1, 0), action: forward, reward: 1.24830694467
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 11, 't': 9, 'action': 'forward', 'reward': 1.248306944670153, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.25)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (6, 2), heading: (0, 1), action: right, reward: 2.33568620794
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 10, 't': 10, 'action': 'right', 'reward': 2.33568620793749, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.34)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (6, 2), heading: (0, 1), action: None, reward: 2.48714461664
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 9, 't': 11, 'action': None, 'reward': 2.4871446166440307, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.49)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (6, 2), heading: (0, 1), action: None, reward: 0.997004422485
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 8, 't': 12, 'action': None, 'reward': 0.9970044224846828, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.00)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 3), heading: (0, 1), action: forward, reward: 1.87157102025
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 7, 't': 13, 'action': 'forward', 'reward': 1.8715710202499627, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.87)
30% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 138
\-------------------------
Environment.reset(): Trial set up with start = (5, 5), destination = (4, 2), deadline = 20
Simulating trial. . .
epsilon = 0.2541; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (4, 5), heading: (-1, 0), action: right, reward: 2.08752753744
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', 'left'), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 2.0875275374399878, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', 'left')
Agent followed the waypoint right. (rewarded 2.09)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (4, 5), heading: (-1, 0), action: None, reward: 1.61884556018
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 19, 't': 1, 'action': None, 'reward': 1.6188455601764244, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.62)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (4, 5), heading: (-1, 0), action: None, reward: 1.93684202881
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.9368420288050232, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 1.94)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (4, 5), heading: (-1, 0), action: None, reward: 2.30914252113
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 17, 't': 3, 'action': None, 'reward': 2.309142521130261, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.31)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (4, 5), heading: (-1, 0), action: None, reward: 1.9552394945
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 16, 't': 4, 'action': None, 'reward': 1.9552394944983797, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.96)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (3, 5), heading: (-1, 0), action: forward, reward: 0.423640775491
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 0.4236407754906031, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove forward instead of left. (rewarded 0.42)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 5), heading: (-1, 0), action: None, reward: 1.98860805289
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 14, 't': 6, 'action': None, 'reward': 1.9886080528882288, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.99)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 6), heading: (0, 1), action: left, reward: 1.28100208001
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 13, 't': 7, 'action': 'left', 'reward': 1.2810020800114952, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.28)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 7), heading: (0, 1), action: forward, reward: -0.0409292886551
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 12, 't': 8, 'action': 'forward', 'reward': -0.0409292886550775, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent drove forward instead of left. (rewarded -0.04)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (3, 7), heading: (0, 1), action: None, reward: 2.03217305335
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 11, 't': 9, 'action': None, 'reward': 2.0321730533467965, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.03)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (3, 7), heading: (0, 1), action: None, reward: 2.61003427591
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 10, 't': 10, 'action': None, 'reward': 2.610034275910899, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.61)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (4, 7), heading: (1, 0), action: left, reward: 0.772415092176
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 9, 't': 11, 'action': 'left', 'reward': 0.7724150921759532, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 0.77)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 2), heading: (0, 1), action: right, reward: 0.942388971283
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 8, 't': 12, 'action': 'right', 'reward': 0.9423889712834659, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 0.94)
35% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 139
\-------------------------
Environment.reset(): Trial set up with start = (1, 6), destination = (4, 3), deadline = 30
Simulating trial. . .
epsilon = 0.2516; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: None, reward: 2.17733106989
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 30, 't': 0, 'action': None, 'reward': 2.177331069893296, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.18)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: None, reward: 1.1141140614
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 29, 't': 1, 'action': None, 'reward': 1.1141140613996994, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.11)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: left, reward: -9.67437743957
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 28, 't': 2, 'action': 'left', 'reward': -9.67437743956521, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent attempted driving left through a red light. (rewarded -9.67)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: None, reward: 1.90146834697
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 27, 't': 3, 'action': None, 'reward': 1.901468346967656, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.90)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: right, reward: 0.201003377545
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 26, 't': 4, 'action': 'right', 'reward': 0.20100337754491482, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent drove right instead of forward. (rewarded 0.20)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: None, reward: 1.52543414057
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 25, 't': 5, 'action': None, 'reward': 1.5254341405667862, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.53)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: None, reward: 2.56694516072
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 24, 't': 6, 'action': None, 'reward': 2.5669451607159086, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.57)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: right, reward: 1.29215249056
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'left'), 'deadline': 23, 't': 7, 'action': 'right', 'reward': 1.2921524905570592, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'left')
Agent drove right instead of left. (rewarded 1.29)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: None, reward: -5.08568978771
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 22, 't': 8, 'action': None, 'reward': -5.08568978771264, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.09)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (8, 2), heading: (0, 1), action: left, reward: 2.28823828374
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 21, 't': 9, 'action': 'left', 'reward': 2.2882382837369315, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 2.29)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (8, 2), heading: (0, 1), action: None, reward: 1.9162061564
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 20, 't': 10, 'action': None, 'reward': 1.9162061563951924, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.92)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (8, 2), heading: (0, 1), action: None, reward: 2.80721094666
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'forward'), 'deadline': 19, 't': 11, 'action': None, 'reward': 2.8072109466552053, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 2.81)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (8, 2), heading: (0, 1), action: None, reward: 1.80465457125
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'left'), 'deadline': 18, 't': 12, 'action': None, 'reward': 1.8046545712508302, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 1.80)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (7, 2), heading: (-1, 0), action: right, reward: 0.671875308107
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'right'), 'deadline': 17, 't': 13, 'action': 'right', 'reward': 0.6718753081066289, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'right')
Agent drove right instead of left. (rewarded 0.67)
53% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (7, 2), heading: (-1, 0), action: None, reward: 0.867938592882
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'right'), 'deadline': 16, 't': 14, 'action': None, 'reward': 0.8679385928816956, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 0.87)
50% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (6, 2), heading: (-1, 0), action: forward, reward: 1.46928315119
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 15, 't': 15, 'action': 'forward', 'reward': 1.469283151185431, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent followed the waypoint forward. (rewarded 1.47)
47% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (5, 2), heading: (-1, 0), action: forward, reward: 1.11159429338
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 14, 't': 16, 'action': 'forward', 'reward': 1.1115942933789764, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.11)
43% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (4, 2), heading: (-1, 0), action: forward, reward: 1.02678309398
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 13, 't': 17, 'action': 'forward', 'reward': 1.0267830939797284, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.03)
40% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 3), heading: (0, 1), action: left, reward: 1.55613529608
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 12, 't': 18, 'action': 'left', 'reward': 1.5561352960794748, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.56)
37% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 140
\-------------------------
Environment.reset(): Trial set up with start = (6, 7), destination = (2, 7), deadline = 20
Simulating trial. . .
epsilon = 0.2491; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: None, reward: -5.06205375182
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 20, 't': 0, 'action': None, 'reward': -5.062053751821782, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -5.06)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 7), heading: (-1, 0), action: forward, reward: 1.16527260303
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', 'forward'), 'deadline': 19, 't': 1, 'action': 'forward', 'reward': 1.165272603028531, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', 'forward')
Agent drove forward instead of right. (rewarded 1.17)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (4, 7), heading: (-1, 0), action: forward, reward: 2.86185044116
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 18, 't': 2, 'action': 'forward', 'reward': 2.8618504411641124, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 2.86)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (4, 2), heading: (0, 1), action: left, reward: 0.056063749178
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'left'), 'deadline': 17, 't': 3, 'action': 'left', 'reward': 0.05606374917798507, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'left')
Agent drove left instead of forward. (rewarded 0.06)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 2), heading: (-1, 0), action: right, reward: 1.50937588995
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'right', None), 'deadline': 16, 't': 4, 'action': 'right', 'reward': 1.50937588995135, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', None)
Agent followed the waypoint right. (rewarded 1.51)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: forward, reward: 2.78979795008
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'left'), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 2.789797950078836, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'left')
Agent followed the waypoint forward. (rewarded 2.79)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (2, 7), heading: (0, -1), action: right, reward: 0.937160191962
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 14, 't': 6, 'action': 'right', 'reward': 0.9371601919623858, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent followed the waypoint right. (rewarded 0.94)
65% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 141
\-------------------------
Environment.reset(): Trial set up with start = (6, 5), destination = (3, 4), deadline = 20
Simulating trial. . .
epsilon = 0.2466; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 5), heading: (0, -1), action: None, reward: 1.93720883037
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 20, 't': 0, 'action': None, 'reward': 1.9372088303707962, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.94)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 5), heading: (0, -1), action: None, reward: 1.11421641171
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'right'), 'deadline': 19, 't': 1, 'action': None, 'reward': 1.1142164117144604, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 1.11)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 5), heading: (0, -1), action: None, reward: 1.19934975052
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.199349750521615, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.20)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 5), heading: (-1, 0), action: left, reward: 2.63962073622
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 17, 't': 3, 'action': 'left', 'reward': 2.6396207362179833, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.64)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 4), heading: (0, -1), action: right, reward: 0.593602282686
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 16, 't': 4, 'action': 'right', 'reward': 0.5936022826859356, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove right instead of forward. (rewarded 0.59)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 4), heading: (0, -1), action: left, reward: -10.9320947842
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 15, 't': 5, 'action': 'left', 'reward': -10.932094784216739, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -10.93)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (5, 4), heading: (0, -1), action: None, reward: -4.64000358871
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 14, 't': 6, 'action': None, 'reward': -4.6400035887120925, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.64)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (4, 4), heading: (-1, 0), action: left, reward: 2.17174014311
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 13, 't': 7, 'action': 'left', 'reward': 2.171740143105154, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 2.17)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (4, 4), heading: (-1, 0), action: forward, reward: -40.4881597125
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 12, 't': 8, 'action': 'forward', 'reward': -40.48815971248691, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -40.49)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (4, 4), heading: (-1, 0), action: None, reward: 2.49901414239
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 11, 't': 9, 'action': None, 'reward': 2.499014142392432, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.50)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (4, 4), heading: (-1, 0), action: left, reward: -39.6800990716
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 10, 't': 10, 'action': 'left', 'reward': -39.680099071628966, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -39.68)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 4), heading: (-1, 0), action: forward, reward: 1.59971019011
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 9, 't': 11, 'action': 'forward', 'reward': 1.5997101901135153, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.60)
40% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 142
\-------------------------
Environment.reset(): Trial set up with start = (8, 2), destination = (7, 5), deadline = 20
Simulating trial. . .
epsilon = 0.2441; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 7), heading: (0, -1), action: right, reward: 1.91901435833
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.9190143583341683, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent drove right instead of forward. (rewarded 1.92)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: left, reward: 2.2763121239
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'right'), 'deadline': 19, 't': 1, 'action': 'left', 'reward': 2.276312123900359, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'right')
Agent followed the waypoint left. (rewarded 2.28)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 6), heading: (0, -1), action: right, reward: 1.26298831369
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 18, 't': 2, 'action': 'right', 'reward': 1.2629883136871638, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 1.26)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: right, reward: 1.62376152799
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 17, 't': 3, 'action': 'right', 'reward': 1.6237615279910962, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent drove right instead of forward. (rewarded 1.62)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 5), heading: (0, -1), action: left, reward: 2.09312926571
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 16, 't': 4, 'action': 'left', 'reward': 2.093129265714249, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.09)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 5), heading: (0, -1), action: left, reward: -39.02273149
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 15, 't': 5, 'action': 'left', 'reward': -39.02273149001816, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -39.02)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 5), heading: (1, 0), action: right, reward: 1.5780229289
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 14, 't': 6, 'action': 'right', 'reward': 1.5780229289041132, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent drove right instead of left. (rewarded 1.58)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 5), heading: (1, 0), action: None, reward: 0.201830748117
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'forward'), 'deadline': 13, 't': 7, 'action': None, 'reward': 0.2018307481167254, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 0.20)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (1, 6), heading: (0, 1), action: right, reward: 0.98573673851
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 12, 't': 8, 'action': 'right', 'reward': 0.9857367385102511, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 0.99)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (8, 6), heading: (-1, 0), action: right, reward: 1.91085527553
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 11, 't': 9, 'action': 'right', 'reward': 1.9108552755318116, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.91)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (7, 6), heading: (-1, 0), action: forward, reward: 1.96009140205
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 10, 't': 10, 'action': 'forward', 'reward': 1.9600914020540237, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.96)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (7, 5), heading: (0, -1), action: right, reward: 1.57905403208
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', 'left'), 'deadline': 9, 't': 11, 'action': 'right', 'reward': 1.5790540320831334, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', 'left')
Agent followed the waypoint right. (rewarded 1.58)
40% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 143
\-------------------------
Environment.reset(): Trial set up with start = (3, 5), destination = (7, 6), deadline = 25
Simulating trial. . .
epsilon = 0.2417; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 6), heading: (0, 1), action: left, reward: 0.238008308188
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 25, 't': 0, 'action': 'left', 'reward': 0.23800830818830465, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove left instead of forward. (rewarded 0.24)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 6), heading: (0, 1), action: None, reward: 1.14441885582
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'forward'), 'deadline': 24, 't': 1, 'action': None, 'reward': 1.1444188558159305, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.14)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: right, reward: 2.66208359056
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 23, 't': 2, 'action': 'right', 'reward': 2.6620835905601092, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.66)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: None, reward: 1.9891081081
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 22, 't': 3, 'action': None, 'reward': 1.9891081081045612, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.99)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: None, reward: 1.12970484325
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 21, 't': 4, 'action': None, 'reward': 1.1297048432465593, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.13)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: left, reward: -19.4929107978
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 3, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 20, 't': 5, 'action': 'left', 'reward': -19.492910797827303, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent attempted driving left through traffic and cause a minor accident. (rewarded -19.49)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: forward, reward: 2.40799516633
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 19, 't': 6, 'action': 'forward', 'reward': 2.4079951663322183, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.41)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: None, reward: 2.01214066187
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 18, 't': 7, 'action': None, 'reward': 2.012140661866553, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.01)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: None, reward: 2.77832293933
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 17, 't': 8, 'action': None, 'reward': 2.778322939325011, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.78)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: None, reward: 1.08909151753
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 16, 't': 9, 'action': None, 'reward': 1.0890915175289575, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.09)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (8, 6), heading: (-1, 0), action: forward, reward: 2.0122592751
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 15, 't': 10, 'action': 'forward', 'reward': 2.0122592750967248, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.01)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (8, 6), heading: (-1, 0), action: None, reward: 2.18878817346
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'right'), 'deadline': 14, 't': 11, 'action': None, 'reward': 2.1887881734593826, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'right')
Agent properly idled at a red light. (rewarded 2.19)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (8, 6), heading: (-1, 0), action: None, reward: 1.86272391851
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 13, 't': 12, 'action': None, 'reward': 1.862723918509663, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.86)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (7, 6), heading: (-1, 0), action: forward, reward: 2.39763036429
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 12, 't': 13, 'action': 'forward', 'reward': 2.397630364294459, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.40)
44% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 144
\-------------------------
Environment.reset(): Trial set up with start = (5, 2), destination = (2, 6), deadline = 25
Simulating trial. . .
epsilon = 0.2393; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (4, 2), heading: (-1, 0), action: left, reward: 2.88088442087
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 25, 't': 0, 'action': 'left', 'reward': 2.8808844208723894, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.88)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 2), heading: (-1, 0), action: forward, reward: 2.06407507568
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 24, 't': 1, 'action': 'forward', 'reward': 2.0640750756758877, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent followed the waypoint forward. (rewarded 2.06)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 2), heading: (-1, 0), action: None, reward: 1.91394627369
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'right'), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.9139462736868949, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 1.91)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 2), heading: (-1, 0), action: None, reward: 1.78200031081
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 22, 't': 3, 'action': None, 'reward': 1.7820003108083102, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.78)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 2), heading: (-1, 0), action: None, reward: 2.8782138088
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', 'forward'), 'deadline': 21, 't': 4, 'action': None, 'reward': 2.8782138088035127, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', 'forward')
Agent properly idled at a red light. (rewarded 2.88)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: forward, reward: 2.08630605699
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 20, 't': 5, 'action': 'forward', 'reward': 2.086306056989574, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.09)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 7), heading: (0, -1), action: right, reward: 2.25675880029
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'right', None), 'deadline': 19, 't': 6, 'action': 'right', 'reward': 2.25675880029279, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', None)
Agent followed the waypoint right. (rewarded 2.26)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (2, 6), heading: (0, -1), action: forward, reward: 1.61004296516
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 1.6100429651635146, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.61)
68% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 145
\-------------------------
Environment.reset(): Trial set up with start = (6, 4), destination = (1, 6), deadline = 25
Simulating trial. . .
epsilon = 0.2369; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (7, 4), heading: (1, 0), action: forward, reward: 2.03682939228
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'forward'), 'deadline': 25, 't': 0, 'action': 'forward', 'reward': 2.0368293922753713, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'forward')
Agent followed the waypoint forward. (rewarded 2.04)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 4), heading: (1, 0), action: forward, reward: 2.3699343074
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 24, 't': 1, 'action': 'forward', 'reward': 2.369934307400262, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.37)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 4), heading: (1, 0), action: None, reward: 1.2815417491
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.2815417491031489, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.28)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 4), heading: (1, 0), action: right, reward: -19.8556820753
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 3, 'light': 'red', 'state': ('forward', 'red', 'left', 'forward'), 'deadline': 22, 't': 3, 'action': 'right', 'reward': -19.855682075318555, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'forward')
Agent attempted driving right through traffic and cause a minor accident. (rewarded -19.86)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 3), heading: (0, -1), action: left, reward: 0.740323402696
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 21, 't': 4, 'action': 'left', 'reward': 0.7403234026959068, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent drove left instead of forward. (rewarded 0.74)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: right, reward: 1.38168375332
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 20, 't': 5, 'action': 'right', 'reward': 1.381683753323617, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent followed the waypoint right. (rewarded 1.38)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: None, reward: 1.63717943732
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 19, 't': 6, 'action': None, 'reward': 1.6371794373216932, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.64)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: forward, reward: -0.0842955017733
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', 'right'), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': -0.08429550177332068, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', 'right')
Agent drove forward instead of left. (rewarded -0.08)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 1.73459803381
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 17, 't': 8, 'action': None, 'reward': 1.7345980338130336, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.73)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 4), heading: (0, 1), action: right, reward: 0.497432246737
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 16, 't': 9, 'action': 'right', 'reward': 0.4974322467368393, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent drove right instead of left. (rewarded 0.50)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (2, 4), heading: (0, 1), action: forward, reward: -10.2230774035
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', 'right', 'left'), 'deadline': 15, 't': 10, 'action': 'forward', 'reward': -10.223077403528455, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', 'left')
Agent attempted driving forward through a red light. (rewarded -10.22)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: right, reward: 1.71579379241
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 14, 't': 11, 'action': 'right', 'reward': 1.7157937924129356, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.72)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: forward, reward: -9.94379844668
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 13, 't': 12, 'action': 'forward', 'reward': -9.943798446683576, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -9.94)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (1, 5), heading: (0, 1), action: left, reward: 1.55117790284
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 12, 't': 13, 'action': 'left', 'reward': 1.5511779028368413, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.55)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (1, 5), heading: (0, 1), action: None, reward: 1.88346272244
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 11, 't': 14, 'action': None, 'reward': 1.8834627224373568, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.88)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: left, reward: 0.701373812721
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 10, 't': 15, 'action': 'left', 'reward': 0.7013738127208806, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove left instead of forward. (rewarded 0.70)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: right, reward: 1.78510318929
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 9, 't': 16, 'action': 'right', 'reward': 1.785103189290844, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent followed the waypoint right. (rewarded 1.79)
32% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: right, reward: 1.79251954481
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 8, 't': 17, 'action': 'right', 'reward': 1.792519544810131, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent followed the waypoint right. (rewarded 1.79)
28% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 146
\-------------------------
Environment.reset(): Trial set up with start = (8, 5), destination = (3, 3), deadline = 25
Simulating trial. . .
epsilon = 0.2346; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 5), heading: (0, 1), action: left, reward: -19.2701487011
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'right', 'left': None}, 'violation': 3, 'light': 'green', 'state': ('left', 'green', 'right', None), 'deadline': 25, 't': 0, 'action': 'left', 'reward': -19.270148701058993, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', None)
Agent attempted driving left through traffic and cause a minor accident. (rewarded -19.27)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 6), heading: (0, 1), action: forward, reward: 1.37584966767
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', None), 'deadline': 24, 't': 1, 'action': 'forward', 'reward': 1.3758496676725176, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', None)
Agent drove forward instead of left. (rewarded 1.38)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 6), heading: (-1, 0), action: right, reward: 0.991843038993
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 23, 't': 2, 'action': 'right', 'reward': 0.9918430389927646, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent drove right instead of left. (rewarded 0.99)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 6), heading: (-1, 0), action: None, reward: 2.55045356612
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 22, 't': 3, 'action': None, 'reward': 2.5504535661198373, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.55)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 6), heading: (-1, 0), action: None, reward: 2.50680800143
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'forward'), 'deadline': 21, 't': 4, 'action': None, 'reward': 2.506808001434636, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 2.51)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 6), heading: (-1, 0), action: right, reward: -20.6674687634
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 3, 'light': 'red', 'state': ('left', 'red', 'left', 'forward'), 'deadline': 20, 't': 5, 'action': 'right', 'reward': -20.667468763394325, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'forward')
Agent attempted driving right through traffic and cause a minor accident. (rewarded -20.67)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (7, 7), heading: (0, 1), action: left, reward: 2.45521545384
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'forward'), 'deadline': 19, 't': 6, 'action': 'left', 'reward': 2.4552154538420545, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'forward')
Agent followed the waypoint left. (rewarded 2.46)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (7, 7), heading: (0, 1), action: left, reward: -40.0051336982
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', 'left', 'forward'), 'deadline': 18, 't': 7, 'action': 'left', 'reward': -40.00513369822316, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'forward')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -40.01)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (7, 7), heading: (0, 1), action: left, reward: -9.10724947854
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 17, 't': 8, 'action': 'left', 'reward': -9.10724947853929, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent attempted driving left through a red light. (rewarded -9.11)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (7, 7), heading: (0, 1), action: None, reward: 1.47396910922
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 16, 't': 9, 'action': None, 'reward': 1.4739691092177472, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.47)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: left, reward: 0.841104082438
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 15, 't': 10, 'action': 'left', 'reward': 0.8411040824378258, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 0.84)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: forward, reward: 1.25043214546
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 14, 't': 11, 'action': 'forward', 'reward': 1.2504321454639877, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.25)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (2, 7), heading: (1, 0), action: forward, reward: 1.78714199699
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'left'), 'deadline': 13, 't': 12, 'action': 'forward', 'reward': 1.7871419969868325, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'left')
Agent followed the waypoint forward. (rewarded 1.79)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (3, 7), heading: (1, 0), action: forward, reward: 2.28292424074
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 12, 't': 13, 'action': 'forward', 'reward': 2.282924240742786, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.28)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (3, 2), heading: (0, 1), action: right, reward: 0.841997526628
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 11, 't': 14, 'action': 'right', 'reward': 0.8419975266283319, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 0.84)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 3), heading: (0, 1), action: forward, reward: 2.1963448189
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 10, 't': 15, 'action': 'forward', 'reward': 2.1963448188960717, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.20)
36% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 147
\-------------------------
Environment.reset(): Trial set up with start = (7, 7), destination = (1, 3), deadline = 20
Simulating trial. . .
epsilon = 0.2322; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: None, reward: 2.25040051116
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 20, 't': 0, 'action': None, 'reward': 2.250400511163147, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.25)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: None, reward: 1.36440076848
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 19, 't': 1, 'action': None, 'reward': 1.3644007684762904, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.36)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: None, reward: 1.85546772339
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.855467723390369, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.86)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: None, reward: 1.80479183013
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 1.804791830128595, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.80)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 6), heading: (0, -1), action: right, reward: 0.880612847892
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'left'), 'deadline': 16, 't': 4, 'action': 'right', 'reward': 0.8806128478924343, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'left')
Agent drove right instead of left. (rewarded 0.88)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: right, reward: 2.41195810563
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 15, 't': 5, 'action': 'right', 'reward': 2.411958105631192, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent followed the waypoint right. (rewarded 2.41)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: None, reward: 2.31221121003
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 14, 't': 6, 'action': None, 'reward': 2.312211210034329, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.31)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (8, 7), heading: (0, 1), action: right, reward: 1.1330421428
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'forward'), 'deadline': 13, 't': 7, 'action': 'right', 'reward': 1.1330421428017552, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'forward')
Agent drove right instead of forward. (rewarded 1.13)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (8, 7), heading: (0, 1), action: None, reward: 1.4741039885
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', 'forward'), 'deadline': 12, 't': 8, 'action': None, 'reward': 1.47410398850058, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 1.47)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (8, 7), heading: (0, 1), action: None, reward: 2.5559044799
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 11, 't': 9, 'action': None, 'reward': 2.5559044798972947, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.56)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: right, reward: 0.533698351255
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 10, 't': 10, 'action': 'right', 'reward': 0.5336983512553711, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 0.53)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (7, 2), heading: (0, 1), action: left, reward: 1.24632269682
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 9, 't': 11, 'action': 'left', 'reward': 1.2463226968237577, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.25)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (7, 2), heading: (0, 1), action: forward, reward: -39.5794073771
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 8, 't': 12, 'action': 'forward', 'reward': -39.57940737714018, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -39.58)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (6, 2), heading: (-1, 0), action: right, reward: -0.205574775623
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 7, 't': 13, 'action': 'right', 'reward': -0.20557477562251225, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded -0.21)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (6, 2), heading: (-1, 0), action: None, reward: 1.97387382785
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 6, 't': 14, 'action': None, 'reward': 1.9738738278531702, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.97)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (6, 3), heading: (0, 1), action: left, reward: 1.04393810282
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 5, 't': 15, 'action': 'left', 'reward': 1.0439381028179395, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.04)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (7, 3), heading: (1, 0), action: left, reward: 2.18628580254
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 4, 't': 16, 'action': 'left', 'reward': 2.186285802542309, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.19)
15% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (7, 3), heading: (1, 0), action: None, reward: 1.4116803292
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 3, 't': 17, 'action': None, 'reward': 1.4116803291992632, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.41)
10% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (7, 3), heading: (1, 0), action: None, reward: 1.22085443798
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'left'), 'deadline': 2, 't': 18, 'action': None, 'reward': 1.2208544379805764, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'left')
Agent properly idled at a red light. (rewarded 1.22)
5% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: forward, reward: 2.11758005803
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'right'), 'deadline': 1, 't': 19, 'action': 'forward', 'reward': 2.11758005802585, 'waypoint': 'forward'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('forward', 'green', 'forward', 'right')
Agent followed the waypoint forward. (rewarded 2.12)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 148
\-------------------------
Environment.reset(): Trial set up with start = (8, 2), destination = (5, 5), deadline = 30
Simulating trial. . .
epsilon = 0.2299; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 2), heading: (1, 0), action: forward, reward: -9.20203773876
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 30, 't': 0, 'action': 'forward', 'reward': -9.202037738759225, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -9.20)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 2), heading: (1, 0), action: forward, reward: -40.9791772421
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 29, 't': 1, 'action': 'forward', 'reward': -40.97917724212881, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -40.98)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 2), heading: (1, 0), action: None, reward: 2.13629407484
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 28, 't': 2, 'action': None, 'reward': 2.1362940748400527, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.14)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 2), heading: (1, 0), action: None, reward: 2.1179994394
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 27, 't': 3, 'action': None, 'reward': 2.1179994394025883, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.12)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 7), heading: (0, -1), action: left, reward: 1.62963593057
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'right'), 'deadline': 26, 't': 4, 'action': 'left', 'reward': 1.6296359305690142, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'right')
Agent followed the waypoint left. (rewarded 1.63)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: left, reward: 1.32419498665
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 25, 't': 5, 'action': 'left', 'reward': 1.324194986650547, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.32)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: forward, reward: 1.80955923631
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 24, 't': 6, 'action': 'forward', 'reward': 1.8095592363094342, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.81)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: None, reward: 2.56781629448
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 23, 't': 7, 'action': None, 'reward': 2.5678162944820846, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.57)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: None, reward: 2.2925920313
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 22, 't': 8, 'action': None, 'reward': 2.292592031297093, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.29)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (5, 7), heading: (-1, 0), action: forward, reward: 1.25257657721
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 21, 't': 9, 'action': 'forward', 'reward': 1.2525765772129107, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.25)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (4, 7), heading: (-1, 0), action: forward, reward: 1.08123081845
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 20, 't': 10, 'action': 'forward', 'reward': 1.0812308184490738, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent drove forward instead of right. (rewarded 1.08)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (4, 6), heading: (0, -1), action: right, reward: 2.71303277209
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', 'left'), 'deadline': 19, 't': 11, 'action': 'right', 'reward': 2.713032772087427, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', 'left')
Agent followed the waypoint right. (rewarded 2.71)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: right, reward: 0.867423487811
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 18, 't': 12, 'action': 'right', 'reward': 0.8674234878113387, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent followed the waypoint right. (rewarded 0.87)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: None, reward: 1.81823193728
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 17, 't': 13, 'action': None, 'reward': 1.8182319372789872, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.82)
53% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: None, reward: 1.84852921446
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 16, 't': 14, 'action': None, 'reward': 1.8485292144562515, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 1.85)
50% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: None, reward: 1.3676284207
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 15, 't': 15, 'action': None, 'reward': 1.3676284207017773, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.37)
47% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: left, reward: -19.6658087582
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 3, 'light': 'green', 'state': ('left', 'green', 'right', None), 'deadline': 14, 't': 16, 'action': 'left', 'reward': -19.66580875818028, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', None)
Agent attempted driving left through traffic and cause a minor accident. (rewarded -19.67)
43% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 5), heading: (0, -1), action: left, reward: 2.24154995228
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 13, 't': 17, 'action': 'left', 'reward': 2.241549952284332, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.24)
40% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 149
\-------------------------
Environment.reset(): Trial set up with start = (5, 7), destination = (4, 4), deadline = 20
Simulating trial. . .
epsilon = 0.2276; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (4, 7), heading: (-1, 0), action: right, reward: 1.97793227879
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.9779322787925282, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 1.98)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (4, 7), heading: (-1, 0), action: None, reward: 2.07627717407
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 19, 't': 1, 'action': None, 'reward': 2.0762771740736943, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.08)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (4, 7), heading: (-1, 0), action: None, reward: 2.19156182276
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.1915618227581826, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.19)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (4, 6), heading: (0, -1), action: right, reward: 1.07431919948
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 17, 't': 3, 'action': 'right', 'reward': 1.0743191994752461, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent drove right instead of left. (rewarded 1.07)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (4, 5), heading: (0, -1), action: forward, reward: 2.20337277373
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'left'), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 2.203372773725435, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'left')
Agent followed the waypoint forward. (rewarded 2.20)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 4), heading: (0, -1), action: forward, reward: 1.38457625666
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 1.384576256656131, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent followed the waypoint forward. (rewarded 1.38)
70% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 150
\-------------------------
Environment.reset(): Trial set up with start = (1, 3), destination = (6, 4), deadline = 20
Simulating trial. . .
epsilon = 0.2254; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 3), heading: (-1, 0), action: right, reward: 2.16404102513
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 2.1640410251269366, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent followed the waypoint right. (rewarded 2.16)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 2), heading: (0, -1), action: right, reward: 1.40093712392
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 1.4009371239214405, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent drove right instead of forward. (rewarded 1.40)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 2), heading: (0, -1), action: None, reward: -4.75872495247
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 18, 't': 2, 'action': None, 'reward': -4.758724952468032, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent idled at a green light with no oncoming traffic. (rewarded -4.76)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 2), heading: (-1, 0), action: left, reward: 1.81968699246
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'left'), 'deadline': 17, 't': 3, 'action': 'left', 'reward': 1.8196869924554435, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'left')
Agent followed the waypoint left. (rewarded 1.82)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 2), heading: (-1, 0), action: left, reward: -39.6526805803
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 16, 't': 4, 'action': 'left', 'reward': -39.652680580323945, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -39.65)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 2), heading: (-1, 0), action: None, reward: -4.79282135914
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 15, 't': 5, 'action': None, 'reward': -4.792821359144882, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.79)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (6, 2), heading: (-1, 0), action: forward, reward: 1.37000242406
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 14, 't': 6, 'action': 'forward', 'reward': 1.3700024240559916, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent followed the waypoint forward. (rewarded 1.37)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 2), heading: (-1, 0), action: None, reward: 1.19828192986
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 13, 't': 7, 'action': None, 'reward': 1.1982819298645346, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.20)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (6, 2), heading: (-1, 0), action: None, reward: 1.85840371783
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 12, 't': 8, 'action': None, 'reward': 1.8584037178294057, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.86)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (5, 2), heading: (-1, 0), action: forward, reward: 0.312276425846
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 11, 't': 9, 'action': 'forward', 'reward': 0.31227642584623005, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove forward instead of left. (rewarded 0.31)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: left, reward: 1.87965420015
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 10, 't': 10, 'action': 'left', 'reward': 1.87965420014506, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.88)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (6, 3), heading: (1, 0), action: left, reward: 1.29296502926
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 9, 't': 11, 'action': 'left', 'reward': 1.2929650292627792, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.29)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 4), heading: (0, 1), action: right, reward: 1.30750645532
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 8, 't': 12, 'action': 'right', 'reward': 1.30750645532446, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 1.31)
35% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 151
\-------------------------
Environment.reset(): Trial set up with start = (2, 2), destination = (7, 6), deadline = 25
Simulating trial. . .
epsilon = 0.2231; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 2), heading: (-1, 0), action: forward, reward: 2.00786991621
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 25, 't': 0, 'action': 'forward', 'reward': 2.0078699162065075, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.01)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 7), heading: (0, -1), action: right, reward: 1.00020397595
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', 'left'), 'deadline': 24, 't': 1, 'action': 'right', 'reward': 1.0002039759544266, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', 'left')
Agent drove right instead of forward. (rewarded 1.00)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: left, reward: 2.70644413669
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 23, 't': 2, 'action': 'left', 'reward': 2.706444136688927, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.71)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 6), heading: (0, -1), action: right, reward: 0.543558112809
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 22, 't': 3, 'action': 'right', 'reward': 0.5435581128092162, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent drove right instead of forward. (rewarded 0.54)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: right, reward: 0.489877538974
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 21, 't': 4, 'action': 'right', 'reward': 0.48987753897353936, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 0.49)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: right, reward: 1.97866992978
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'right', None), 'deadline': 20, 't': 5, 'action': 'right', 'reward': 1.9786699297781483, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'right', None)
Agent followed the waypoint right. (rewarded 1.98)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: right, reward: 2.40781170598
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 19, 't': 6, 'action': 'right', 'reward': 2.407811705983403, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.41)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: forward, reward: 1.05045566578
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 1.0504556657767692, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.05)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (7, 6), heading: (0, -1), action: right, reward: 1.03608112763
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'right', 'left'), 'deadline': 17, 't': 8, 'action': 'right', 'reward': 1.0360811276250737, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', 'left')
Agent followed the waypoint right. (rewarded 1.04)
64% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 152
\-------------------------
Environment.reset(): Trial set up with start = (4, 3), destination = (8, 7), deadline = 30
Simulating trial. . .
epsilon = 0.2209; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (4, 3), heading: (1, 0), action: None, reward: 2.87934417339
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 30, 't': 0, 'action': None, 'reward': 2.8793441733870813, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.88)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (4, 3), heading: (1, 0), action: None, reward: 1.02675233883
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 29, 't': 1, 'action': None, 'reward': 1.0267523388342048, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.03)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (4, 3), heading: (1, 0), action: None, reward: 1.18353178804
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 28, 't': 2, 'action': None, 'reward': 1.183531788042247, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.18)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (4, 3), heading: (1, 0), action: None, reward: 1.73652457362
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 27, 't': 3, 'action': None, 'reward': 1.7365245736215824, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.74)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (4, 2), heading: (0, -1), action: left, reward: 1.54202686291
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 26, 't': 4, 'action': 'left', 'reward': 1.5420268629113034, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 1.54)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (3, 2), heading: (-1, 0), action: left, reward: 1.22614466983
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 25, 't': 5, 'action': 'left', 'reward': 1.2261446698316336, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.23)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: forward, reward: 1.75475494261
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'left'), 'deadline': 24, 't': 6, 'action': 'forward', 'reward': 1.7547549426128255, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'left')
Agent followed the waypoint forward. (rewarded 1.75)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: forward, reward: -9.0927533684
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 23, 't': 7, 'action': 'forward', 'reward': -9.092753368404939, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -9.09)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: None, reward: 2.5404934175
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 22, 't': 8, 'action': None, 'reward': 2.54049341750292, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.54)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: left, reward: -20.5548301336
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 3, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 21, 't': 9, 'action': 'left', 'reward': -20.554830133564522, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent attempted driving left through traffic and cause a minor accident. (rewarded -20.55)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (1, 2), heading: (-1, 0), action: forward, reward: 2.26621196222
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 20, 't': 10, 'action': 'forward', 'reward': 2.266211962223144, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.27)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (1, 2), heading: (-1, 0), action: None, reward: 2.33838731804
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 19, 't': 11, 'action': None, 'reward': 2.3383873180416277, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.34)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (1, 2), heading: (-1, 0), action: None, reward: 2.19898511565
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 18, 't': 12, 'action': None, 'reward': 2.198985115651577, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.20)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: forward, reward: 1.65519916543
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 17, 't': 13, 'action': 'forward', 'reward': 1.655199165431652, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'right')
Agent followed the waypoint forward. (rewarded 1.66)
53% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: None, reward: -0.128016959469
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', 'forward'), 'deadline': 16, 't': 14, 'action': None, 'reward': -0.12801695946947245, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded -0.13)
50% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (8, 7), heading: (0, -1), action: right, reward: 1.530715532
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', 'left'), 'deadline': 15, 't': 15, 'action': 'right', 'reward': 1.5307155320036354, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', 'left')
Agent followed the waypoint right. (rewarded 1.53)
47% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 153
\-------------------------
Environment.reset(): Trial set up with start = (4, 5), destination = (6, 2), deadline = 25
Simulating trial. . .
epsilon = 0.2187; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: None, reward: 1.58003619827
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', 'forward'), 'deadline': 25, 't': 0, 'action': None, 'reward': 1.5800361982659827, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', 'forward')
Agent properly idled at a red light. (rewarded 1.58)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: None, reward: 2.28844988638
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 24, 't': 1, 'action': None, 'reward': 2.288449886375056, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.29)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: None, reward: 1.23513602967
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.235136029667202, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.24)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: None, reward: 1.69279012687
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 22, 't': 3, 'action': None, 'reward': 1.6927901268746268, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 1.69)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: None, reward: 2.42736314949
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 21, 't': 4, 'action': None, 'reward': 2.427363149493166, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.43)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 5), heading: (1, 0), action: forward, reward: 2.21776445436
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 20, 't': 5, 'action': 'forward', 'reward': 2.2177644543554385, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.22)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (5, 5), heading: (1, 0), action: None, reward: 2.73309072565
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 19, 't': 6, 'action': None, 'reward': 2.7330907256505403, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.73)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 5), heading: (1, 0), action: forward, reward: 1.75708688118
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 1.7570868811750044, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'right')
Agent followed the waypoint forward. (rewarded 1.76)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (6, 6), heading: (0, 1), action: right, reward: 2.19449358531
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 17, 't': 8, 'action': 'right', 'reward': 2.194493585310954, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.19)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (6, 6), heading: (0, 1), action: None, reward: 1.73531097307
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 16, 't': 9, 'action': None, 'reward': 1.7353109730732397, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 1.74)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (6, 7), heading: (0, 1), action: forward, reward: 2.1821396087
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'right'), 'deadline': 15, 't': 10, 'action': 'forward', 'reward': 2.1821396087003375, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'right')
Agent followed the waypoint forward. (rewarded 2.18)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (6, 7), heading: (0, 1), action: None, reward: 1.84903980644
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 14, 't': 11, 'action': None, 'reward': 1.8490398064351425, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.85)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (6, 7), heading: (0, 1), action: None, reward: 1.86818556809
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 13, 't': 12, 'action': None, 'reward': 1.8681855680903403, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.87)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (7, 7), heading: (1, 0), action: left, reward: 0.22029926911
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'right'), 'deadline': 12, 't': 13, 'action': 'left', 'reward': 0.2202992691104988, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'right')
Agent drove left instead of forward. (rewarded 0.22)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (7, 7), heading: (1, 0), action: None, reward: -5.19326689243
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', 'left', 'left'), 'deadline': 11, 't': 14, 'action': None, 'reward': -5.193266892432514, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', 'left')
Agent idled at a green light with no oncoming traffic. (rewarded -5.19)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (7, 2), heading: (0, 1), action: right, reward: 0.72691113211
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 10, 't': 15, 'action': 'right', 'reward': 0.7269111321100619, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent followed the waypoint right. (rewarded 0.73)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 2), heading: (-1, 0), action: right, reward: 2.0651871008
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 9, 't': 16, 'action': 'right', 'reward': 2.065187100796558, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.07)
32% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 154
\-------------------------
Environment.reset(): Trial set up with start = (8, 3), destination = (3, 6), deadline = 30
Simulating trial. . .
epsilon = 0.2165; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 3), heading: (0, 1), action: None, reward: 2.77038329063
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 30, 't': 0, 'action': None, 'reward': 2.7703832906276524, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.77)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 3), heading: (0, 1), action: None, reward: 1.11097570409
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 29, 't': 1, 'action': None, 'reward': 1.1109757040918105, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.11)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 3), heading: (0, 1), action: None, reward: 1.53613143069
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 28, 't': 2, 'action': None, 'reward': 1.5361314306938767, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.54)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 3), heading: (0, 1), action: None, reward: 1.56508336642
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 27, 't': 3, 'action': None, 'reward': 1.5650833664227777, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.57)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 3), heading: (0, 1), action: None, reward: 2.41769721301
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 26, 't': 4, 'action': None, 'reward': 2.4176972130105487, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.42)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: left, reward: 1.48741661836
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 25, 't': 5, 'action': 'left', 'reward': 1.4874166183551951, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.49)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: forward, reward: 2.40589152372
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 24, 't': 6, 'action': 'forward', 'reward': 2.4058915237243057, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.41)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 1.20828200678
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 23, 't': 7, 'action': None, 'reward': 1.2082820067844031, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.21)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 0.96254702442
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'right'), 'deadline': 22, 't': 8, 'action': None, 'reward': 0.9625470244204233, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'right')
Agent properly idled at a red light. (rewarded 0.96)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 2.19342829194
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 21, 't': 9, 'action': None, 'reward': 2.1934282919419936, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.19)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 2.46719125049
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 20, 't': 10, 'action': None, 'reward': 2.467191250486517, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.47)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: forward, reward: 2.74147169286
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 19, 't': 11, 'action': 'forward', 'reward': 2.741471692860326, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.74)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: None, reward: 1.71884546165
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 18, 't': 12, 'action': None, 'reward': 1.718845461654233, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.72)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (3, 2), heading: (0, -1), action: left, reward: 2.31537164836
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 17, 't': 13, 'action': 'left', 'reward': 2.3153716483555127, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.32)
53% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (3, 7), heading: (0, -1), action: forward, reward: 1.20122321519
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 16, 't': 14, 'action': 'forward', 'reward': 1.201223215188727, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent followed the waypoint forward. (rewarded 1.20)
50% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 6), heading: (0, -1), action: forward, reward: 2.18637688937
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 15, 't': 15, 'action': 'forward', 'reward': 2.1863768893727995, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.19)
47% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 155
\-------------------------
Environment.reset(): Trial set up with start = (2, 2), destination = (4, 5), deadline = 25
Simulating trial. . .
epsilon = 0.2144; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 2), heading: (1, 0), action: right, reward: 1.90720185739
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 1.907201857386003, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 1.91)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 2), heading: (1, 0), action: None, reward: 1.03467487026
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 24, 't': 1, 'action': None, 'reward': 1.0346748702563624, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.03)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 2), heading: (1, 0), action: None, reward: 1.59240331497
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.5924033149738246, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.59)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 2), heading: (1, 0), action: None, reward: 0.999636172258
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 22, 't': 3, 'action': None, 'reward': 0.9996361722581573, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.00)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 2), heading: (1, 0), action: left, reward: -9.49782069298
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 21, 't': 4, 'action': 'left', 'reward': -9.49782069297786, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent attempted driving left through a red light. (rewarded -9.50)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (4, 2), heading: (1, 0), action: forward, reward: 2.50825729968
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 20, 't': 5, 'action': 'forward', 'reward': 2.5082572996766057, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.51)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 2), heading: (1, 0), action: None, reward: 1.98178565668
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 19, 't': 6, 'action': None, 'reward': 1.9817856566818726, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.98)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (4, 7), heading: (0, -1), action: left, reward: 2.04715725644
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 18, 't': 7, 'action': 'left', 'reward': 2.0471572564392657, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.05)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (4, 6), heading: (0, -1), action: forward, reward: 2.27730178908
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 17, 't': 8, 'action': 'forward', 'reward': 2.2773017890823173, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.28)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (4, 6), heading: (0, -1), action: None, reward: 1.1152673895
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 16, 't': 9, 'action': None, 'reward': 1.1152673894992589, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.12)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 5), heading: (0, -1), action: forward, reward: 0.914683842492
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 15, 't': 10, 'action': 'forward', 'reward': 0.9146838424920836, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 0.91)
56% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 156
\-------------------------
Environment.reset(): Trial set up with start = (5, 5), destination = (8, 3), deadline = 25
Simulating trial. . .
epsilon = 0.2122; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 4), heading: (0, -1), action: right, reward: 2.23741563847
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 2.2374156384688693, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.24)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 4), heading: (0, -1), action: None, reward: 1.49716410536
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', 'forward'), 'deadline': 24, 't': 1, 'action': None, 'reward': 1.4971641053635167, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 1.50)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 4), heading: (1, 0), action: right, reward: 1.86811008575
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', 'left'), 'deadline': 23, 't': 2, 'action': 'right', 'reward': 1.8681100857477695, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', 'left')
Agent followed the waypoint right. (rewarded 1.87)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 4), heading: (1, 0), action: forward, reward: 1.85060017902
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': 1.8506001790188282, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.85)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 4), heading: (1, 0), action: None, reward: 2.6087615512
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 21, 't': 4, 'action': None, 'reward': 2.608761551197607, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.61)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 4), heading: (1, 0), action: None, reward: 1.30682304591
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 20, 't': 5, 'action': None, 'reward': 1.3068230459088654, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.31)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (7, 4), heading: (1, 0), action: None, reward: 2.08329384406
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 19, 't': 6, 'action': None, 'reward': 2.0832938440646545, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.08)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (8, 4), heading: (1, 0), action: forward, reward: 1.99107816057
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 1.9910781605731738, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.99)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (8, 4), heading: (1, 0), action: forward, reward: -10.0220557895
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 17, 't': 8, 'action': 'forward', 'reward': -10.022055789511008, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent attempted driving forward through a red light. (rewarded -10.02)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (8, 4), heading: (1, 0), action: None, reward: 1.76244082961
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 16, 't': 9, 'action': None, 'reward': 1.7624408296138299, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.76)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (8, 3), heading: (0, -1), action: left, reward: 2.73577570609
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 15, 't': 10, 'action': 'left', 'reward': 2.735775706087315, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.74)
56% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 157
\-------------------------
Environment.reset(): Trial set up with start = (5, 3), destination = (2, 6), deadline = 30
Simulating trial. . .
epsilon = 0.2101; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 3), heading: (-1, 0), action: right, reward: -19.5365734143
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': 'forward'}, 'violation': 3, 'light': 'red', 'state': ('forward', 'red', 'forward', 'forward'), 'deadline': 30, 't': 0, 'action': 'right', 'reward': -19.536573414298445, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'forward')
Agent attempted driving right through traffic and cause a minor accident. (rewarded -19.54)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 3), heading: (-1, 0), action: None, reward: 2.96239797369
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'forward'), 'deadline': 29, 't': 1, 'action': None, 'reward': 2.962397973691844, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 2.96)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 2), heading: (0, -1), action: right, reward: 0.156964900133
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'left'), 'deadline': 28, 't': 2, 'action': 'right', 'reward': 0.15696490013329079, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'left')
Agent drove right instead of forward. (rewarded 0.16)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 2), heading: (0, -1), action: None, reward: 2.41380550276
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 27, 't': 3, 'action': None, 'reward': 2.413805502758467, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.41)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (4, 2), heading: (-1, 0), action: left, reward: 1.66189371445
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'forward'), 'deadline': 26, 't': 4, 'action': 'left', 'reward': 1.6618937144524422, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'forward')
Agent followed the waypoint left. (rewarded 1.66)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (4, 7), heading: (0, -1), action: right, reward: 0.690976630358
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 25, 't': 5, 'action': 'right', 'reward': 0.6909766303584645, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent drove right instead of forward. (rewarded 0.69)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 7), heading: (0, -1), action: None, reward: 1.81934253585
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 24, 't': 6, 'action': None, 'reward': 1.8193425358514514, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.82)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 7), heading: (-1, 0), action: left, reward: 2.80653385747
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 23, 't': 7, 'action': 'left', 'reward': 2.8065338574732746, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.81)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 7), heading: (-1, 0), action: forward, reward: 1.31928654547
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 22, 't': 8, 'action': 'forward', 'reward': 1.319286545469166, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.32)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 2), heading: (0, 1), action: left, reward: 0.682117258693
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 21, 't': 9, 'action': 'left', 'reward': 0.6821172586929369, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent drove left instead of right. (rewarded 0.68)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (1, 2), heading: (-1, 0), action: right, reward: 1.9924748541
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 20, 't': 10, 'action': 'right', 'reward': 1.992474854103828, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.99)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (1, 7), heading: (0, -1), action: right, reward: 2.42307435433
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 19, 't': 11, 'action': 'right', 'reward': 2.423074354333605, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.42)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (1, 7), heading: (0, -1), action: None, reward: 0.800580489876
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'forward'), 'deadline': 18, 't': 12, 'action': None, 'reward': 0.8005804898764507, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 0.80)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (2, 7), heading: (1, 0), action: right, reward: 2.62086914989
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 17, 't': 13, 'action': 'right', 'reward': 2.6208691498900647, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent followed the waypoint right. (rewarded 2.62)
53% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (2, 6), heading: (0, -1), action: left, reward: 1.14525351497
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 16, 't': 14, 'action': 'left', 'reward': 1.1452535149735916, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.15)
50% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 158
\-------------------------
Environment.reset(): Trial set up with start = (4, 6), destination = (1, 2), deadline = 25
Simulating trial. . .
epsilon = 0.2080; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (4, 7), heading: (0, 1), action: right, reward: 1.04173138994
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'right', None), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 1.0417313899381608, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'right', None)
Agent followed the waypoint right. (rewarded 1.04)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 7), heading: (-1, 0), action: right, reward: 1.65257244282
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 24, 't': 1, 'action': 'right', 'reward': 1.6525724428179343, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent followed the waypoint right. (rewarded 1.65)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 7), heading: (-1, 0), action: forward, reward: 1.7510975274
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 23, 't': 2, 'action': 'forward', 'reward': 1.751097527403093, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.75)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (1, 7), heading: (-1, 0), action: forward, reward: 1.14140507363
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': 1.1414050736327772, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.14)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 7), heading: (-1, 0), action: None, reward: 1.35384462092
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 21, 't': 4, 'action': None, 'reward': 1.3538446209202923, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.35)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 2), heading: (0, 1), action: left, reward: 2.65036839222
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 20, 't': 5, 'action': 'left', 'reward': 2.6503683922209476, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.65)
76% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 159
\-------------------------
Environment.reset(): Trial set up with start = (2, 6), destination = (6, 3), deadline = 35
Simulating trial. . .
epsilon = 0.2060; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 5), heading: (0, -1), action: right, reward: 0.447989289022
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'left'), 'deadline': 35, 't': 0, 'action': 'right', 'reward': 0.4479892890220123, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'left')
Agent drove right instead of forward. (rewarded 0.45)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: right, reward: 0.0160197877108
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 34, 't': 1, 'action': 'right', 'reward': 0.016019787710768396, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 0.02)
94% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: forward, reward: 1.75986529927
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 33, 't': 2, 'action': 'forward', 'reward': 1.759865299270346, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.76)
91% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 5), heading: (1, 0), action: forward, reward: 2.5990174541
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 32, 't': 3, 'action': 'forward', 'reward': 2.5990174541024347, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 2.60)
89% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (6, 5), heading: (1, 0), action: forward, reward: 2.27877190949
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 31, 't': 4, 'action': 'forward', 'reward': 2.2787719094916863, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.28)
86% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (6, 4), heading: (0, -1), action: left, reward: 2.40699103249
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 30, 't': 5, 'action': 'left', 'reward': 2.4069910324912893, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.41)
83% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 3), heading: (0, -1), action: forward, reward: 2.83217317377
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'left'), 'deadline': 29, 't': 6, 'action': 'forward', 'reward': 2.83217317377297, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'left')
Agent followed the waypoint forward. (rewarded 2.83)
80% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 160
\-------------------------
Environment.reset(): Trial set up with start = (3, 2), destination = (5, 4), deadline = 20
Simulating trial. . .
epsilon = 0.2039; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 2), heading: (0, 1), action: None, reward: 1.05300025531
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'left'), 'deadline': 20, 't': 0, 'action': None, 'reward': 1.0530002553115823, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 1.05)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 2), heading: (0, 1), action: None, reward: 2.89820988778
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 19, 't': 1, 'action': None, 'reward': 2.898209887779825, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.90)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 2), heading: (0, 1), action: None, reward: 2.52655519938
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.526555199380419, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.53)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 2), heading: (0, 1), action: None, reward: 2.6855430415
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'forward'), 'deadline': 17, 't': 3, 'action': None, 'reward': 2.68554304149988, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 2.69)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 2), heading: (0, 1), action: None, reward: 1.15280154892
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 16, 't': 4, 'action': None, 'reward': 1.152801548919595, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.15)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (4, 2), heading: (1, 0), action: left, reward: 1.66950510955
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 15, 't': 5, 'action': 'left', 'reward': 1.6695051095481523, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.67)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 2), heading: (1, 0), action: None, reward: 1.78193789455
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 14, 't': 6, 'action': None, 'reward': 1.7819378945457922, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.78)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (5, 2), heading: (1, 0), action: forward, reward: 1.58758858025
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 1.5875885802502065, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.59)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: right, reward: 1.09470917735
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 12, 't': 8, 'action': 'right', 'reward': 1.0947091773456246, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.09)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: None, reward: 1.11445230334
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 11, 't': 9, 'action': None, 'reward': 1.1144523033404592, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.11)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: None, reward: 0.764777428496
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 10, 't': 10, 'action': None, 'reward': 0.7647774284955666, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 0.76)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 4), heading: (0, 1), action: forward, reward: 2.20669478575
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 9, 't': 11, 'action': 'forward', 'reward': 2.2066947857544434, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.21)
40% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 161
\-------------------------
Environment.reset(): Trial set up with start = (6, 7), destination = (5, 4), deadline = 20
Simulating trial. . .
epsilon = 0.2019; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 6), heading: (0, -1), action: right, reward: 0.754628804939
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', 'left'), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 0.7546288049387362, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', 'left')
Agent drove right instead of forward. (rewarded 0.75)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 6), heading: (0, -1), action: None, reward: 1.82206894595
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 19, 't': 1, 'action': None, 'reward': 1.8220689459501203, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.82)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 6), heading: (0, -1), action: None, reward: 1.70604036647
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.7060403664723736, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.71)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (6, 6), heading: (0, -1), action: None, reward: 1.5353698991
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'right'), 'deadline': 17, 't': 3, 'action': None, 'reward': 1.5353698990994344, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'right')
Agent properly idled at a red light. (rewarded 1.54)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 6), heading: (-1, 0), action: left, reward: 2.80345757763
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 16, 't': 4, 'action': 'left', 'reward': 2.803457577626329, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.80)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 5), heading: (0, -1), action: right, reward: 2.18911368232
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 15, 't': 5, 'action': 'right', 'reward': 2.1891136823196518, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent followed the waypoint right. (rewarded 2.19)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 4), heading: (0, -1), action: forward, reward: 2.00270715288
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 14, 't': 6, 'action': 'forward', 'reward': 2.0027071528842075, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.00)
65% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 162
\-------------------------
Environment.reset(): Trial set up with start = (5, 7), destination = (6, 4), deadline = 20
Simulating trial. . .
epsilon = 0.1999; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 7), heading: (1, 0), action: forward, reward: 2.01250664166
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 20, 't': 0, 'action': 'forward', 'reward': 2.0125066416591686, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 2.01)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (7, 7), heading: (1, 0), action: forward, reward: 1.86447246295
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 19, 't': 1, 'action': 'forward', 'reward': 1.864472462951516, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent drove forward instead of right. (rewarded 1.86)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 2), heading: (0, 1), action: right, reward: 1.71913922977
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 18, 't': 2, 'action': 'right', 'reward': 1.7191392297687214, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.72)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (6, 2), heading: (-1, 0), action: right, reward: 1.01126153602
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 17, 't': 3, 'action': 'right', 'reward': 1.0112615360188848, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.01)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (6, 2), heading: (-1, 0), action: forward, reward: -10.0382813405
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': -10.03828134053732, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -10.04)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (6, 3), heading: (0, 1), action: left, reward: 1.7815191622
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 15, 't': 5, 'action': 'left', 'reward': 1.7815191622039062, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.78)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 4), heading: (0, 1), action: forward, reward: 2.03310930423
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 14, 't': 6, 'action': 'forward', 'reward': 2.0331093042256994, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.03)
65% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 163
\-------------------------
Environment.reset(): Trial set up with start = (2, 4), destination = (5, 3), deadline = 20
Simulating trial. . .
epsilon = 0.1979; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: right, reward: 2.76878976103
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 2.7687897610308756, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 2.77)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: forward, reward: -40.7187276408
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 19, 't': 1, 'action': 'forward', 'reward': -40.718727640804985, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -40.72)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: None, reward: 1.66141011669
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'right'), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.6614101166883695, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 1.66)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: None, reward: 1.97167802125
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 17, 't': 3, 'action': None, 'reward': 1.9716780212532192, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.97)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: None, reward: 1.28002624557
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 16, 't': 4, 'action': None, 'reward': 1.2800262455665257, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.28)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (3, 3), heading: (0, -1), action: left, reward: 1.12626238683
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 15, 't': 5, 'action': 'left', 'reward': 1.1262623868292845, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove left instead of forward. (rewarded 1.13)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 3), heading: (0, -1), action: left, reward: -40.9473713003
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('right', 'red', None, 'forward'), 'deadline': 14, 't': 6, 'action': 'left', 'reward': -40.94737130026185, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'forward')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -40.95)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (4, 3), heading: (1, 0), action: right, reward: 2.40629440488
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 13, 't': 7, 'action': 'right', 'reward': 2.406294404883057, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 2.41)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (4, 4), heading: (0, 1), action: right, reward: 0.639955504514
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', 'forward'), 'deadline': 12, 't': 8, 'action': 'right', 'reward': 0.6399555045141726, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', 'forward')
Agent drove right instead of forward. (rewarded 0.64)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (3, 4), heading: (-1, 0), action: right, reward: 1.2692513675
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 11, 't': 9, 'action': 'right', 'reward': 1.2692513674976342, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove right instead of left. (rewarded 1.27)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (3, 4), heading: (-1, 0), action: forward, reward: -9.23407993303
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 10, 't': 10, 'action': 'forward', 'reward': -9.234079933031397, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent attempted driving forward through a red light. (rewarded -9.23)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (3, 3), heading: (0, -1), action: right, reward: 2.09895616313
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'right', 'left'), 'deadline': 9, 't': 11, 'action': 'right', 'reward': 2.0989561631343836, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', 'left')
Agent followed the waypoint right. (rewarded 2.10)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (4, 3), heading: (1, 0), action: right, reward: 2.36068097702
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 8, 't': 12, 'action': 'right', 'reward': 2.3606809770224286, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.36)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (4, 4), heading: (0, 1), action: right, reward: 0.829399803637
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 7, 't': 13, 'action': 'right', 'reward': 0.8293998036366682, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent drove right instead of forward. (rewarded 0.83)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (4, 4), heading: (0, 1), action: None, reward: 1.73372374111
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 6, 't': 14, 'action': None, 'reward': 1.7337237411051487, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.73)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (4, 4), heading: (0, 1), action: None, reward: 2.20436303858
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 5, 't': 15, 'action': None, 'reward': 2.2043630385781365, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 2.20)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (5, 4), heading: (1, 0), action: left, reward: 1.9615472446
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 4, 't': 16, 'action': 'left', 'reward': 1.9615472446041626, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.96)
15% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (5, 4), heading: (1, 0), action: None, reward: 0.928394304443
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 3, 't': 17, 'action': None, 'reward': 0.9283943044425023, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.93)
10% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (5, 4), heading: (1, 0), action: left, reward: -9.96754218772
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 2, 't': 18, 'action': 'left', 'reward': -9.967542187717791, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -9.97)
5% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (5, 4), heading: (1, 0), action: forward, reward: -10.4804209683
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 1, 't': 19, 'action': 'forward', 'reward': -10.480420968332217, 'waypoint': 'left'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -10.48)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 164
\-------------------------
Environment.reset(): Trial set up with start = (7, 7), destination = (2, 3), deadline = 25
Simulating trial. . .
epsilon = 0.1959; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: right, reward: 2.42486349339
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 2.424863493394472, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 2.42)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: None, reward: 1.15570425301
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 24, 't': 1, 'action': None, 'reward': 1.1557042530073003, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.16)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: None, reward: 2.31448964559
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 23, 't': 2, 'action': None, 'reward': 2.3144896455922535, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.31)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: None, reward: 2.51132238571
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 22, 't': 3, 'action': None, 'reward': 2.5113223857064924, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.51)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: forward, reward: 1.1377133209
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 1.1377133209028278, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.14)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: None, reward: 2.17566535019
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 20, 't': 5, 'action': None, 'reward': 2.1756653501862706, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.18)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: None, reward: 1.14822095806
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 19, 't': 6, 'action': None, 'reward': 1.148220958058684, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.15)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 7), heading: (1, 0), action: forward, reward: 1.04641866742
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 1.0464186674227673, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.05)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 2), heading: (0, 1), action: right, reward: 1.95913539513
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 17, 't': 8, 'action': 'right', 'reward': 1.9591353951263932, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent followed the waypoint right. (rewarded 1.96)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 2), heading: (0, 1), action: None, reward: 1.40587555401
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 16, 't': 9, 'action': None, 'reward': 1.4058755540122239, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.41)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (2, 2), heading: (0, 1), action: None, reward: 1.08499560728
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 15, 't': 10, 'action': None, 'reward': 1.0849956072762812, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.08)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (2, 3), heading: (0, 1), action: forward, reward: 0.808967720708
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 14, 't': 11, 'action': 'forward', 'reward': 0.8089677207079671, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 0.81)
52% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 165
\-------------------------
Environment.reset(): Trial set up with start = (2, 4), destination = (5, 6), deadline = 25
Simulating trial. . .
epsilon = 0.1940; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 4), heading: (-1, 0), action: None, reward: 2.0510269462
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 25, 't': 0, 'action': None, 'reward': 2.0510269462028283, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 2.05)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 4), heading: (-1, 0), action: None, reward: 2.25089713048
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 24, 't': 1, 'action': None, 'reward': 2.2508971304847494, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.25)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 4), heading: (-1, 0), action: None, reward: 1.77810482442
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.7781048244180981, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.78)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 4), heading: (-1, 0), action: None, reward: 2.51967047836
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 22, 't': 3, 'action': None, 'reward': 2.519670478355411, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.52)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 5), heading: (0, 1), action: left, reward: 0.984961985901
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 21, 't': 4, 'action': 'left', 'reward': 0.9849619859010386, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 0.98)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 5), heading: (0, 1), action: None, reward: 1.74723114759
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 20, 't': 5, 'action': None, 'reward': 1.7472311475901026, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.75)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 5), heading: (0, 1), action: None, reward: 2.44030675312
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 19, 't': 6, 'action': None, 'reward': 2.4403067531155798, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 2.44)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: left, reward: 1.54866748873
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 18, 't': 7, 'action': 'left', 'reward': 1.5486674887281306, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.55)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: forward, reward: 0.957381534344
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 17, 't': 8, 'action': 'forward', 'reward': 0.9573815343443817, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 0.96)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: None, reward: 0.867141033333
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'left'), 'deadline': 16, 't': 9, 'action': None, 'reward': 0.8671410333333895, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'left')
Agent properly idled at a red light. (rewarded 0.87)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: None, reward: 1.82250308647
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 15, 't': 10, 'action': None, 'reward': 1.822503086474122, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.82)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: None, reward: 2.58085954943
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 14, 't': 11, 'action': None, 'reward': 2.580859549425676, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.58)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: forward, reward: -39.0350564022
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', 'left', 'forward'), 'deadline': 13, 't': 12, 'action': 'forward', 'reward': -39.03505640216372, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'forward')
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -39.04)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (5, 5), heading: (1, 0), action: forward, reward: 0.971651347743
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 12, 't': 13, 'action': 'forward', 'reward': 0.9716513477425786, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 0.97)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 6), heading: (0, 1), action: right, reward: 1.25929972006
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 11, 't': 14, 'action': 'right', 'reward': 1.2592997200628635, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.26)
40% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 166
\-------------------------
Environment.reset(): Trial set up with start = (7, 3), destination = (5, 7), deadline = 20
Simulating trial. . .
epsilon = 0.1920; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (7, 2), heading: (0, -1), action: left, reward: 1.43958681528
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 20, 't': 0, 'action': 'left', 'reward': 1.4395868152846167, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.44)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (7, 2), heading: (0, -1), action: forward, reward: -10.8840240433
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 19, 't': 1, 'action': 'forward', 'reward': -10.884024043309235, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent attempted driving forward through a red light. (rewarded -10.88)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 2), heading: (0, -1), action: None, reward: 2.62153959129
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', 'forward'), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.6215395912942854, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 2.62)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 7), heading: (0, -1), action: forward, reward: 0.0470261315661
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 0.04702613156612123, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove forward instead of left. (rewarded 0.05)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 7), heading: (0, -1), action: None, reward: 2.59955829221
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'left'), 'deadline': 16, 't': 4, 'action': None, 'reward': 2.599558292207257, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 2.60)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 7), heading: (0, -1), action: None, reward: 2.22155901057
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 15, 't': 5, 'action': None, 'reward': 2.22155901057147, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.22)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (7, 7), heading: (0, -1), action: None, reward: 2.07635342196
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 14, 't': 6, 'action': None, 'reward': 2.0763534219608695, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.08)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: left, reward: 2.54985263618
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 13, 't': 7, 'action': 'left', 'reward': 2.5498526361817566, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.55)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: forward, reward: -9.2084273508
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': 'right'}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'right', 'right'), 'deadline': 12, 't': 8, 'action': 'forward', 'reward': -9.208427350798845, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', 'right')
Agent attempted driving forward through a red light. (rewarded -9.21)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 7), heading: (-1, 0), action: forward, reward: 1.71389486491
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 11, 't': 9, 'action': 'forward', 'reward': 1.7138948649059595, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.71)
50% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 167
\-------------------------
Environment.reset(): Trial set up with start = (7, 6), destination = (4, 5), deadline = 20
Simulating trial. . .
epsilon = 0.1901; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 6), heading: (-1, 0), action: right, reward: 1.94585262524
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.945852625242929, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent followed the waypoint right. (rewarded 1.95)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 6), heading: (-1, 0), action: forward, reward: 2.88345196384
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 19, 't': 1, 'action': 'forward', 'reward': 2.883451963842817, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.88)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 6), heading: (-1, 0), action: None, reward: 1.21100528961
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.2110052896094092, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.21)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 6), heading: (-1, 0), action: None, reward: 1.04265489258
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 1.0426548925822166, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.04)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 6), heading: (-1, 0), action: left, reward: -10.014608776
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 16, 't': 4, 'action': 'left', 'reward': -10.014608775952594, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent attempted driving left through a red light. (rewarded -10.01)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 6), heading: (-1, 0), action: None, reward: 1.34406664331
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 15, 't': 5, 'action': None, 'reward': 1.34406664331049, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.34)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 6), heading: (-1, 0), action: forward, reward: 2.62228711189
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'right'), 'deadline': 14, 't': 6, 'action': 'forward', 'reward': 2.6222871118919224, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'right')
Agent followed the waypoint forward. (rewarded 2.62)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 5), heading: (0, -1), action: right, reward: 2.34547238289
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 13, 't': 7, 'action': 'right', 'reward': 2.3454723828918, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent followed the waypoint right. (rewarded 2.35)
60% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 168
\-------------------------
Environment.reset(): Trial set up with start = (2, 2), destination = (6, 7), deadline = 25
Simulating trial. . .
epsilon = 0.1882; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 2), heading: (-1, 0), action: right, reward: 2.8267569807
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'right', 'left'), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 2.8267569807011848, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'right', 'left')
Agent followed the waypoint right. (rewarded 2.83)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 2), heading: (-1, 0), action: left, reward: -10.0344322427
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 24, 't': 1, 'action': 'left', 'reward': -10.03443224265968, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent attempted driving left through a red light. (rewarded -10.03)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 2), heading: (-1, 0), action: None, reward: 2.76302521335
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 23, 't': 2, 'action': None, 'reward': 2.7630252133503865, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.76)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: forward, reward: 2.53054483603
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': 2.5305448360325755, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.53)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: None, reward: 2.77885106003
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'right'), 'deadline': 21, 't': 4, 'action': None, 'reward': 2.7788510600300724, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 2.78)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: None, reward: 2.20303091743
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 20, 't': 5, 'action': None, 'reward': 2.2030309174318914, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.20)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: None, reward: 2.75514563929
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 19, 't': 6, 'action': None, 'reward': 2.7551456392933673, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.76)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (7, 2), heading: (-1, 0), action: forward, reward: 2.04772516222
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 2.0477251622151345, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.05)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (6, 2), heading: (-1, 0), action: forward, reward: 2.59772796928
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 17, 't': 8, 'action': 'forward', 'reward': 2.5977279692832793, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.60)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 7), heading: (0, -1), action: right, reward: 2.70313139115
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 16, 't': 9, 'action': 'right', 'reward': 2.7031313911490873, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 2.70)
60% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 169
\-------------------------
Environment.reset(): Trial set up with start = (6, 7), destination = (3, 2), deadline = 20
Simulating trial. . .
epsilon = 0.1864; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (7, 7), heading: (1, 0), action: right, reward: 1.526393966
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', 'forward'), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.5263939659970795, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', 'forward')
Agent drove right instead of left. (rewarded 1.53)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (7, 7), heading: (1, 0), action: forward, reward: -10.7747213936
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 19, 't': 1, 'action': 'forward', 'reward': -10.774721393583489, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -10.77)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 7), heading: (1, 0), action: None, reward: 1.32559231698
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.3255923169814945, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.33)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 7), heading: (1, 0), action: None, reward: 1.42813565721
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'forward'), 'deadline': 17, 't': 3, 'action': None, 'reward': 1.4281356572088786, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 1.43)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: forward, reward: 1.80102681649
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 1.8010268164871326, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.80)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: forward, reward: -10.6793943282
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': -10.679394328163212, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent attempted driving forward through a red light. (rewarded -10.68)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: None, reward: 2.33332235271
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 14, 't': 6, 'action': None, 'reward': 2.333322352710998, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.33)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: None, reward: -4.09551498699
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 13, 't': 7, 'action': None, 'reward': -4.095514986985876, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.10)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: forward, reward: 2.33418957919
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 12, 't': 8, 'action': 'forward', 'reward': 2.334189579188135, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.33)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 7), heading: (1, 0), action: forward, reward: 0.892925140608
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 11, 't': 9, 'action': 'forward', 'reward': 0.8929251406084535, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 0.89)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (3, 7), heading: (1, 0), action: forward, reward: 0.90993734859
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'right'), 'deadline': 10, 't': 10, 'action': 'forward', 'reward': 0.9099373485897615, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'right')
Agent followed the waypoint forward. (rewarded 0.91)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 2), heading: (0, 1), action: right, reward: 2.39894557592
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 9, 't': 11, 'action': 'right', 'reward': 2.3989455759194285, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.40)
40% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 170
\-------------------------
Environment.reset(): Trial set up with start = (2, 7), destination = (8, 4), deadline = 25
Simulating trial. . .
epsilon = 0.1845; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 7), heading: (-1, 0), action: forward, reward: 2.48986837252
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 25, 't': 0, 'action': 'forward', 'reward': 2.4898683725176163, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.49)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 7), heading: (-1, 0), action: left, reward: -39.3058946404
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 24, 't': 1, 'action': 'left', 'reward': -39.30589464040618, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -39.31)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 7), heading: (-1, 0), action: None, reward: 2.19112139557
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 23, 't': 2, 'action': None, 'reward': 2.1911213955715203, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.19)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: forward, reward: 2.5470176755
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': 2.547017675495398, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.55)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 2), heading: (0, 1), action: left, reward: 2.83930609623
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 21, 't': 4, 'action': 'left', 'reward': 2.839306096234986, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.84)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 3), heading: (0, 1), action: forward, reward: 1.69984346475
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 20, 't': 5, 'action': 'forward', 'reward': 1.6998434647484324, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.70)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (7, 3), heading: (-1, 0), action: right, reward: 0.327126539583
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 19, 't': 6, 'action': 'right', 'reward': 0.3271265395831301, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove right instead of forward. (rewarded 0.33)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (7, 3), heading: (-1, 0), action: None, reward: 2.25358324722
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 18, 't': 7, 'action': None, 'reward': 2.2535832472158974, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.25)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (7, 3), heading: (-1, 0), action: None, reward: 2.32542773154
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 17, 't': 8, 'action': None, 'reward': 2.3254277315415846, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.33)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (7, 2), heading: (0, -1), action: right, reward: 0.24093561798
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 16, 't': 9, 'action': 'right', 'reward': 0.24093561798005259, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 0.24)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (8, 2), heading: (1, 0), action: right, reward: 1.79157287673
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 15, 't': 10, 'action': 'right', 'reward': 1.791572876726078, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 1.79)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (8, 3), heading: (0, 1), action: right, reward: 2.62150630236
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 14, 't': 11, 'action': 'right', 'reward': 2.62150630235631, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent followed the waypoint right. (rewarded 2.62)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (8, 3), heading: (0, 1), action: None, reward: 2.30064293329
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 13, 't': 12, 'action': None, 'reward': 2.300642933290262, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.30)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (8, 4), heading: (0, 1), action: forward, reward: 1.49121351323
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 12, 't': 13, 'action': 'forward', 'reward': 1.4912135132254785, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.49)
44% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 171
\-------------------------
Environment.reset(): Trial set up with start = (4, 6), destination = (7, 2), deadline = 25
Simulating trial. . .
epsilon = 0.1827; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: forward, reward: 1.6841884682
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 25, 't': 0, 'action': 'forward', 'reward': 1.6841884682047472, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.68)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: None, reward: 1.7934312902
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 24, 't': 1, 'action': None, 'reward': 1.793431290202615, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.79)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: None, reward: 2.65786096375
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'right'), 'deadline': 23, 't': 2, 'action': None, 'reward': 2.6578609637503776, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'right')
Agent properly idled at a red light. (rewarded 2.66)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: None, reward: 1.36358123494
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'forward'), 'deadline': 22, 't': 3, 'action': None, 'reward': 1.3635812349360332, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 1.36)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: left, reward: -10.3086917431
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 21, 't': 4, 'action': 'left', 'reward': -10.308691743096668, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent attempted driving left through a red light. (rewarded -10.31)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: None, reward: 1.3117827516
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 20, 't': 5, 'action': None, 'reward': 1.3117827516016942, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.31)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (6, 6), heading: (1, 0), action: forward, reward: 2.11533784264
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 19, 't': 6, 'action': 'forward', 'reward': 2.1153378426357117, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.12)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (7, 6), heading: (1, 0), action: forward, reward: 2.15590167609
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 2.1559016760929106, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.16)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (7, 7), heading: (0, 1), action: right, reward: 1.74548309574
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 17, 't': 8, 'action': 'right', 'reward': 1.7454830957387342, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.75)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (7, 7), heading: (0, 1), action: None, reward: 2.63156416967
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 16, 't': 9, 'action': None, 'reward': 2.6315641696711927, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.63)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (7, 2), heading: (0, 1), action: forward, reward: 2.59163528871
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 15, 't': 10, 'action': 'forward', 'reward': 2.591635288714918, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.59)
56% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 172
\-------------------------
Environment.reset(): Trial set up with start = (2, 3), destination = (7, 2), deadline = 20
Simulating trial. . .
epsilon = 0.1809; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 1.17680022839
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 20, 't': 0, 'action': None, 'reward': 1.1768002283921828, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.18)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 1.52834526759
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 19, 't': 1, 'action': None, 'reward': 1.528345267592924, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.53)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: left, reward: -9.54802669855
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 18, 't': 2, 'action': 'left', 'reward': -9.548026698546028, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -9.55)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 2.77918018542
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 17, 't': 3, 'action': None, 'reward': 2.7791801854153855, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.78)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 2.30928852472
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', 'forward'), 'deadline': 16, 't': 4, 'action': None, 'reward': 2.3092885247172346, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 2.31)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 2.78237921811
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', 'forward'), 'deadline': 15, 't': 5, 'action': None, 'reward': 2.78237921810659, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 2.78)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: forward, reward: 0.936561769849
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 14, 't': 6, 'action': 'forward', 'reward': 0.9365617698492072, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove forward instead of left. (rewarded 0.94)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: forward, reward: -9.77097092971
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': -9.770970929707852, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -9.77)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: None, reward: 1.36214325521
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 12, 't': 8, 'action': None, 'reward': 1.362143255214367, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.36)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (3, 2), heading: (0, -1), action: left, reward: 1.50584821285
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 11, 't': 9, 'action': 'left', 'reward': 1.5058482128487642, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.51)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: left, reward: 0.931928888772
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 10, 't': 10, 'action': 'left', 'reward': 0.931928888772487, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 0.93)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (1, 2), heading: (-1, 0), action: forward, reward: 2.01058750628
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 9, 't': 11, 'action': 'forward', 'reward': 2.0105875062817855, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.01)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: forward, reward: 1.46111652396
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 8, 't': 12, 'action': 'forward', 'reward': 1.4611165239578277, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'right')
Agent followed the waypoint forward. (rewarded 1.46)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: None, reward: 2.33588191909
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 7, 't': 13, 'action': None, 'reward': 2.3358819190855664, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.34)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: None, reward: 1.49876349542
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'right'), 'deadline': 6, 't': 14, 'action': None, 'reward': 1.4987634954189684, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'right')
Agent properly idled at a red light. (rewarded 1.50)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: None, reward: 2.10354317446
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 5, 't': 15, 'action': None, 'reward': 2.103543174463863, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.10)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (8, 7), heading: (0, -1), action: right, reward: 1.25725892613
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'left'), 'deadline': 4, 't': 16, 'action': 'right', 'reward': 1.2572589261283982, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'left')
Agent drove right instead of forward. (rewarded 1.26)
15% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: left, reward: 1.57214076674
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 3, 't': 17, 'action': 'left', 'reward': 1.5721407667426757, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.57)
10% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: None, reward: -5.36068422536
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 2, 't': 18, 'action': None, 'reward': -5.360684225361174, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.36)
5% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: None, reward: 1.19257495602
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 1, 't': 19, 'action': None, 'reward': 1.1925749560202077, 'waypoint': 'left'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.19)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 173
\-------------------------
Environment.reset(): Trial set up with start = (6, 2), destination = (4, 5), deadline = 25
Simulating trial. . .
epsilon = 0.1791; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 2), heading: (0, 1), action: None, reward: 1.82275718919
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', 'forward'), 'deadline': 25, 't': 0, 'action': None, 'reward': 1.822757189193541, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 1.82)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 2), heading: (-1, 0), action: right, reward: 1.62287708311
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 24, 't': 1, 'action': 'right', 'reward': 1.6228770831060422, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 1.62)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 2), heading: (-1, 0), action: None, reward: 2.09419876774
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 23, 't': 2, 'action': None, 'reward': 2.0941987677350244, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.09)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 2), heading: (-1, 0), action: None, reward: 2.81704411266
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 22, 't': 3, 'action': None, 'reward': 2.817044112664302, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.82)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (4, 2), heading: (-1, 0), action: forward, reward: 2.10721107938
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 2.107211079379578, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.11)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (4, 2), heading: (-1, 0), action: None, reward: -0.0165052736572
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'forward'), 'deadline': 20, 't': 5, 'action': None, 'reward': -0.016505273657179242, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded -0.02)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 7), heading: (0, -1), action: right, reward: 1.60514722229
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 19, 't': 6, 'action': 'right', 'reward': 1.6051472222924716, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.61)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (4, 7), heading: (0, -1), action: None, reward: 1.13194550449
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 18, 't': 7, 'action': None, 'reward': 1.1319455044946127, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.13)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (4, 7), heading: (0, -1), action: None, reward: 1.42308319033
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 17, 't': 8, 'action': None, 'reward': 1.4230831903265504, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.42)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (4, 7), heading: (0, -1), action: None, reward: 2.7905847012
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 16, 't': 9, 'action': None, 'reward': 2.7905847012049536, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.79)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (4, 6), heading: (0, -1), action: forward, reward: 1.25938682902
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 15, 't': 10, 'action': 'forward', 'reward': 1.2593868290224002, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.26)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (4, 6), heading: (0, -1), action: None, reward: 1.83865131783
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 14, 't': 11, 'action': None, 'reward': 1.8386513178303825, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.84)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (4, 6), heading: (0, -1), action: None, reward: 1.18328818108
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 13, 't': 12, 'action': None, 'reward': 1.18328818108064, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.18)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 5), heading: (0, -1), action: forward, reward: 2.27035423903
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 12, 't': 13, 'action': 'forward', 'reward': 2.2703542390252855, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.27)
44% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 174
\-------------------------
Environment.reset(): Trial set up with start = (6, 6), destination = (3, 5), deadline = 20
Simulating trial. . .
epsilon = 0.1773; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 7), heading: (0, 1), action: right, reward: 0.274526763778
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', 'right'), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 0.2745267637778819, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', 'right')
Agent drove right instead of left. (rewarded 0.27)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 7), heading: (-1, 0), action: right, reward: 1.42583636258
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 1.425836362577708, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.43)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 7), heading: (-1, 0), action: None, reward: 1.39302674073
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.3930267407322812, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.39)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 7), heading: (-1, 0), action: forward, reward: -9.32998846632
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': -9.329988466318994, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent attempted driving forward through a red light. (rewarded -9.33)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 7), heading: (-1, 0), action: None, reward: 2.74152242496
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 16, 't': 4, 'action': None, 'reward': 2.7415224249557184, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.74)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (4, 7), heading: (-1, 0), action: forward, reward: 1.61430218466
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 1.6143021846603678, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.61)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 7), heading: (-1, 0), action: None, reward: 2.73912513435
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 14, 't': 6, 'action': None, 'reward': 2.7391251343535785, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.74)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 7), heading: (-1, 0), action: forward, reward: 2.21774344035
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 2.217743440345472, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.22)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 6), heading: (0, -1), action: right, reward: 2.13994274087
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 12, 't': 8, 'action': 'right', 'reward': 2.139942740865753, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent followed the waypoint right. (rewarded 2.14)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 5), heading: (0, -1), action: forward, reward: 1.82760454931
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 11, 't': 9, 'action': 'forward', 'reward': 1.8276045493149047, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent followed the waypoint forward. (rewarded 1.83)
50% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 175
\-------------------------
Environment.reset(): Trial set up with start = (3, 7), destination = (8, 6), deadline = 20
Simulating trial. . .
epsilon = 0.1755; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 7), heading: (1, 0), action: None, reward: 2.25131648067
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', 'forward'), 'deadline': 20, 't': 0, 'action': None, 'reward': 2.2513164806719836, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 2.25)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 7), heading: (1, 0), action: None, reward: 2.7043019291
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 19, 't': 1, 'action': None, 'reward': 2.704301929102969, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.70)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 7), heading: (1, 0), action: None, reward: 2.79103294104
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', 'forward'), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.791032941043506, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 2.79)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 2), heading: (0, 1), action: right, reward: 1.26973130317
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', 'forward'), 'deadline': 17, 't': 3, 'action': 'right', 'reward': 1.2697313031705404, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', 'forward')
Agent drove right instead of left. (rewarded 1.27)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: right, reward: 1.14429996773
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 16, 't': 4, 'action': 'right', 'reward': 1.1442999677305434, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 1.14)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 2), heading: (-1, 0), action: forward, reward: 1.72304053366
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 1.7230405336550823, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.72)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: forward, reward: 1.47975120758
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 14, 't': 6, 'action': 'forward', 'reward': 1.479751207577805, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.48)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (8, 7), heading: (0, -1), action: right, reward: 2.18979771792
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 13, 't': 7, 'action': 'right', 'reward': 2.189797717915667, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent followed the waypoint right. (rewarded 2.19)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: right, reward: 1.33450293624
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', 'right'), 'deadline': 12, 't': 8, 'action': 'right', 'reward': 1.3345029362382554, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', 'right')
Agent drove right instead of forward. (rewarded 1.33)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: None, reward: 1.05528815891
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', 'left'), 'deadline': 11, 't': 9, 'action': None, 'reward': 1.0552881589066159, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', 'left')
Agent properly idled at a red light. (rewarded 1.06)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: right, reward: -19.0909982351
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 3, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 10, 't': 10, 'action': 'right', 'reward': -19.090998235113634, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent attempted driving right through traffic and cause a minor accident. (rewarded -19.09)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (1, 6), heading: (0, -1), action: left, reward: 1.33553216358
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 9, 't': 11, 'action': 'left', 'reward': 1.335532163581497, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.34)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (8, 6), heading: (-1, 0), action: left, reward: 2.56634458155
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 8, 't': 12, 'action': 'left', 'reward': 2.5663445815493695, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 2.57)
35% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 176
\-------------------------
Environment.reset(): Trial set up with start = (1, 2), destination = (7, 5), deadline = 25
Simulating trial. . .
epsilon = 0.1738; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 2), heading: (0, -1), action: None, reward: 1.62925624346
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', 'forward'), 'deadline': 25, 't': 0, 'action': None, 'reward': 1.6292562434636388, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', 'forward')
Agent properly idled at a red light. (rewarded 1.63)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 2), heading: (0, -1), action: None, reward: 1.27062998244
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 24, 't': 1, 'action': None, 'reward': 1.2706299824432448, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.27)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 2), heading: (0, -1), action: left, reward: -9.08810950613
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 23, 't': 2, 'action': 'left', 'reward': -9.088109506126372, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -9.09)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: left, reward: 1.41307669526
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 22, 't': 3, 'action': 'left', 'reward': 1.4130766952601894, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.41)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 7), heading: (0, -1), action: right, reward: 1.29814823605
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 21, 't': 4, 'action': 'right', 'reward': 1.2981482360504906, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent drove right instead of forward. (rewarded 1.30)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 7), heading: (0, -1), action: left, reward: -10.8476657844
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 20, 't': 5, 'action': 'left', 'reward': -10.847665784357213, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -10.85)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (8, 7), heading: (0, -1), action: None, reward: 1.6894911777
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 19, 't': 6, 'action': None, 'reward': 1.68949117770169, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.69)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: left, reward: 1.47251797204
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 18, 't': 7, 'action': 'left', 'reward': 1.472517972043789, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.47)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (7, 6), heading: (0, -1), action: right, reward: 1.83774467428
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 17, 't': 8, 'action': 'right', 'reward': 1.8377446742766577, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.84)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (7, 6), heading: (0, -1), action: None, reward: 1.57560682538
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 16, 't': 9, 'action': None, 'reward': 1.5756068253785993, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.58)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (7, 6), heading: (0, -1), action: None, reward: 1.07721455661
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 15, 't': 10, 'action': None, 'reward': 1.0772145566067446, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.08)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: right, reward: 0.97018580665
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 14, 't': 11, 'action': 'right', 'reward': 0.9701858066497285, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent drove right instead of forward. (rewarded 0.97)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (8, 5), heading: (0, -1), action: left, reward: 1.54164065841
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 13, 't': 12, 'action': 'left', 'reward': 1.541640658413102, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.54)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (8, 4), heading: (0, -1), action: forward, reward: 0.0913203565893
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 12, 't': 13, 'action': 'forward', 'reward': 0.09132035658932991, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove forward instead of left. (rewarded 0.09)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (8, 4), heading: (0, -1), action: None, reward: 2.4068604357
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', 'forward'), 'deadline': 11, 't': 14, 'action': None, 'reward': 2.4068604357012395, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', 'forward')
Agent properly idled at a red light. (rewarded 2.41)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (8, 4), heading: (0, -1), action: None, reward: 1.90921504355
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 10, 't': 15, 'action': None, 'reward': 1.9092150435541204, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.91)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (1, 4), heading: (1, 0), action: right, reward: 0.980304040211
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'left'), 'deadline': 9, 't': 16, 'action': 'right', 'reward': 0.9803040402106936, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'left')
Agent drove right instead of left. (rewarded 0.98)
32% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (1, 5), heading: (0, 1), action: right, reward: 1.41342482363
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 8, 't': 17, 'action': 'right', 'reward': 1.413424823634792, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent followed the waypoint right. (rewarded 1.41)
28% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (1, 5), heading: (0, 1), action: None, reward: 0.761631499467
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 7, 't': 18, 'action': None, 'reward': 0.7616314994668563, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 0.76)
24% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: right, reward: 1.79901028347
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 6, 't': 19, 'action': 'right', 'reward': 1.7990102834700161, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.80)
20% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (7, 5), heading: (-1, 0), action: forward, reward: 0.97683950882
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 5, 't': 20, 'action': 'forward', 'reward': 0.9768395088199344, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 0.98)
16% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 177
\-------------------------
Environment.reset(): Trial set up with start = (8, 5), destination = (4, 5), deadline = 20
Simulating trial. . .
epsilon = 0.1720; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 5), heading: (1, 0), action: left, reward: 1.3646292132
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 20, 't': 0, 'action': 'left', 'reward': 1.3646292132030602, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.36)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: forward, reward: 2.52486260783
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'left'), 'deadline': 19, 't': 1, 'action': 'forward', 'reward': 2.524862607826316, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'left')
Agent followed the waypoint forward. (rewarded 2.52)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: forward, reward: 1.97633552247
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 18, 't': 2, 'action': 'forward', 'reward': 1.9763355224747319, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.98)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: None, reward: 2.44192353754
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 17, 't': 3, 'action': None, 'reward': 2.441923537542982, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.44)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: forward, reward: 1.68003270055
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 1.6800327005474693, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.68)
75% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 178
\-------------------------
Environment.reset(): Trial set up with start = (5, 3), destination = (8, 4), deadline = 20
Simulating trial. . .
epsilon = 0.1703; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 3), heading: (1, 0), action: left, reward: 1.99272627697
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 20, 't': 0, 'action': 'left', 'reward': 1.9927262769722844, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.99)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 3), heading: (1, 0), action: forward, reward: -9.18400982507
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 19, 't': 1, 'action': 'forward', 'reward': -9.184009825070214, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -9.18)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 3), heading: (1, 0), action: None, reward: 1.7659116178
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.765911617800772, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.77)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 3), heading: (1, 0), action: forward, reward: 2.85552617393
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 2.8555261739345044, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.86)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 3), heading: (1, 0), action: None, reward: 2.38156328501
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'right'), 'deadline': 16, 't': 4, 'action': None, 'reward': 2.381563285005864, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 2.38)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: forward, reward: 1.4114617064
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 1.4114617064027104, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.41)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: forward, reward: 1.12897076105
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 14, 't': 6, 'action': 'forward', 'reward': 1.1289707610506063, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent drove forward instead of right. (rewarded 1.13)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 4), heading: (0, 1), action: right, reward: 2.01852526766
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'right', None), 'deadline': 13, 't': 7, 'action': 'right', 'reward': 2.0185252676563143, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'right', None)
Agent followed the waypoint right. (rewarded 2.02)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (8, 4), heading: (-1, 0), action: right, reward: 2.46507798595
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 12, 't': 8, 'action': 'right', 'reward': 2.465077985946858, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.47)
55% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 179
\-------------------------
Environment.reset(): Trial set up with start = (7, 7), destination = (4, 3), deadline = 25
Simulating trial. . .
epsilon = 0.1686; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (7, 2), heading: (0, 1), action: right, reward: 2.25961788361
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 2.259617883613281, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.26)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 2), heading: (-1, 0), action: right, reward: 1.51059317707
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', 'right'), 'deadline': 24, 't': 1, 'action': 'right', 'reward': 1.5105931770653724, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', 'right')
Agent followed the waypoint right. (rewarded 1.51)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 2), heading: (-1, 0), action: forward, reward: 2.29080684601
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 23, 't': 2, 'action': 'forward', 'reward': 2.2908068460077677, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent followed the waypoint forward. (rewarded 2.29)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 2), heading: (-1, 0), action: None, reward: 1.51037843756
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 22, 't': 3, 'action': None, 'reward': 1.5103784375615195, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.51)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (4, 2), heading: (-1, 0), action: forward, reward: 1.5386380598
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 1.5386380598003244, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.54)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (4, 2), heading: (-1, 0), action: None, reward: 1.43459418135
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 20, 't': 5, 'action': None, 'reward': 1.4345941813493577, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.43)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 2), heading: (-1, 0), action: None, reward: 1.64389926837
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 19, 't': 6, 'action': None, 'reward': 1.643899268367721, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.64)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 3), heading: (0, 1), action: left, reward: 2.69348143565
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 18, 't': 7, 'action': 'left', 'reward': 2.6934814356526626, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.69)
68% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 180
\-------------------------
Environment.reset(): Trial set up with start = (3, 4), destination = (6, 3), deadline = 20
Simulating trial. . .
epsilon = 0.1670; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (4, 4), heading: (1, 0), action: right, reward: 2.05409222977
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 2.0540922297670945, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent followed the waypoint right. (rewarded 2.05)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (4, 5), heading: (0, 1), action: right, reward: 1.07957111526
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 1.079571115262656, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent drove right instead of forward. (rewarded 1.08)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 5), heading: (1, 0), action: left, reward: 2.85388218206
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'right'), 'deadline': 18, 't': 2, 'action': 'left', 'reward': 2.8538821820581033, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'right')
Agent followed the waypoint left. (rewarded 2.85)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (6, 5), heading: (1, 0), action: forward, reward: 1.93212207224
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 1.9321220722438652, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.93)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (6, 5), heading: (1, 0), action: None, reward: 2.62852285736
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 16, 't': 4, 'action': None, 'reward': 2.6285228573590205, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.63)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (6, 4), heading: (0, -1), action: left, reward: 2.90953640474
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 15, 't': 5, 'action': 'left', 'reward': 2.9095364047384407, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.91)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 3), heading: (0, -1), action: forward, reward: 2.76043687509
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 14, 't': 6, 'action': 'forward', 'reward': 2.7604368750917256, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.76)
65% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 181
\-------------------------
Environment.reset(): Trial set up with start = (1, 6), destination = (5, 3), deadline = 35
Simulating trial. . .
epsilon = 0.1653; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: right, reward: 1.12757273084
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 35, 't': 0, 'action': 'right', 'reward': 1.127572730838848, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 1.13)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: right, reward: 2.6198100574
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', 'left'), 'deadline': 34, 't': 1, 'action': 'right', 'reward': 2.6198100574026766, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', 'left')
Agent followed the waypoint right. (rewarded 2.62)
94% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: forward, reward: 1.51272391476
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 33, 't': 2, 'action': 'forward', 'reward': 1.512723914764579, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.51)
91% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: None, reward: 2.37330975773
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 32, 't': 3, 'action': None, 'reward': 2.3733097577319313, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.37)
89% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 6), heading: (0, -1), action: right, reward: 0.960343697253
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', 'forward'), 'deadline': 31, 't': 4, 'action': 'right', 'reward': 0.9603436972525043, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', 'forward')
Agent drove right instead of forward. (rewarded 0.96)
86% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 6), heading: (0, -1), action: None, reward: 1.42000667693
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'left'), 'deadline': 30, 't': 5, 'action': None, 'reward': 1.4200066769334432, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 1.42)
83% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (6, 6), heading: (-1, 0), action: left, reward: 2.92265395368
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 29, 't': 6, 'action': 'left', 'reward': 2.9226539536834606, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.92)
80% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 6), heading: (-1, 0), action: None, reward: 1.64503246225
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 28, 't': 7, 'action': None, 'reward': 1.6450324622460173, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.65)
77% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (6, 6), heading: (-1, 0), action: None, reward: 2.47947514548
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'forward'), 'deadline': 27, 't': 8, 'action': None, 'reward': 2.4794751454771005, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 2.48)
74% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (6, 6), heading: (-1, 0), action: None, reward: 1.68163810483
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 26, 't': 9, 'action': None, 'reward': 1.6816381048306068, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.68)
71% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (6, 5), heading: (0, -1), action: right, reward: 1.0847924091
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'left'), 'deadline': 25, 't': 10, 'action': 'right', 'reward': 1.0847924091048373, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'left')
Agent drove right instead of forward. (rewarded 1.08)
69% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (6, 5), heading: (0, -1), action: left, reward: -39.7842967197
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 24, 't': 11, 'action': 'left', 'reward': -39.78429671966053, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -39.78)
66% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (6, 5), heading: (0, -1), action: forward, reward: -9.56607630202
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 23, 't': 12, 'action': 'forward', 'reward': -9.566076302020294, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -9.57)
63% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (5, 5), heading: (-1, 0), action: left, reward: 0.981551063072
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 22, 't': 13, 'action': 'left', 'reward': 0.9815510630721029, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 0.98)
60% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (5, 4), heading: (0, -1), action: right, reward: 1.87861990684
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 21, 't': 14, 'action': 'right', 'reward': 1.878619906839332, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent followed the waypoint right. (rewarded 1.88)
57% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 3), heading: (0, -1), action: forward, reward: 0.841848040326
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 20, 't': 15, 'action': 'forward', 'reward': 0.8418480403263382, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 0.84)
54% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 182
\-------------------------
Environment.reset(): Trial set up with start = (7, 2), destination = (1, 4), deadline = 20
Simulating trial. . .
epsilon = 0.1637; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (7, 2), heading: (1, 0), action: None, reward: 2.79608408461
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 20, 't': 0, 'action': None, 'reward': 2.7960840846147583, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.80)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (7, 2), heading: (1, 0), action: None, reward: 1.16636088579
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 19, 't': 1, 'action': None, 'reward': 1.1663608857882404, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.17)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 2), heading: (1, 0), action: None, reward: 1.62234143232
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.6223414323235406, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.62)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 2), heading: (1, 0), action: forward, reward: 1.96516431977
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 1.9651643197663657, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.97)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 2), heading: (1, 0), action: left, reward: -9.56772959734
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 16, 't': 4, 'action': 'left', 'reward': -9.567729597341017, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent attempted driving left through a red light. (rewarded -9.57)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 2), heading: (1, 0), action: None, reward: 1.24553556905
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 15, 't': 5, 'action': None, 'reward': 1.2455355690531937, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.25)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (8, 2), heading: (1, 0), action: None, reward: 1.28665928487
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 14, 't': 6, 'action': None, 'reward': 1.286659284871466, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 1.29)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: forward, reward: 2.50206938533
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 2.502069385332085, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.50)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (1, 3), heading: (0, 1), action: right, reward: 1.92157565477
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 12, 't': 8, 'action': 'right', 'reward': 1.9215756547719638, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.92)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (1, 3), heading: (0, 1), action: forward, reward: -9.20186707443
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 11, 't': 9, 'action': 'forward', 'reward': -9.201867074433837, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent attempted driving forward through a red light. (rewarded -9.20)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (1, 3), heading: (0, 1), action: None, reward: 1.71920764991
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 10, 't': 10, 'action': None, 'reward': 1.7192076499119648, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.72)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: left, reward: 0.51886282072
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 9, 't': 11, 'action': 'left', 'reward': 0.5188628207195993, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent drove left instead of forward. (rewarded 0.52)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (2, 4), heading: (0, 1), action: right, reward: 1.33377457137
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'right'), 'deadline': 8, 't': 12, 'action': 'right', 'reward': 1.3337745713674407, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'right')
Agent followed the waypoint right. (rewarded 1.33)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (2, 4), heading: (0, 1), action: None, reward: 0.136901531496
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'forward'), 'deadline': 7, 't': 13, 'action': None, 'reward': 0.13690153149647866, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 0.14)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: right, reward: 2.28303881319
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 6, 't': 14, 'action': 'right', 'reward': 2.28303881319117, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent followed the waypoint right. (rewarded 2.28)
25% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 183
\-------------------------
Environment.reset(): Trial set up with start = (6, 7), destination = (3, 3), deadline = 25
Simulating trial. . .
epsilon = 0.1620; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 7), heading: (-1, 0), action: right, reward: 1.02370429415
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 1.0237042941514838, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.02)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 7), heading: (-1, 0), action: None, reward: -4.77875391974
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 24, 't': 1, 'action': None, 'reward': -4.778753919744111, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -4.78)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (4, 7), heading: (-1, 0), action: forward, reward: 2.63092398824
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 23, 't': 2, 'action': 'forward', 'reward': 2.6309239882420963, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.63)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (4, 7), heading: (-1, 0), action: None, reward: 1.7023416942
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 22, 't': 3, 'action': None, 'reward': 1.7023416942049403, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.70)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 7), heading: (-1, 0), action: forward, reward: 1.64822165997
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 1.6482216599666644, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.65)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (3, 7), heading: (-1, 0), action: None, reward: 2.80853241474
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 20, 't': 5, 'action': None, 'reward': 2.808532414738558, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.81)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 7), heading: (-1, 0), action: forward, reward: -9.40789914512
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 19, 't': 6, 'action': 'forward', 'reward': -9.407899145122133, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -9.41)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 2), heading: (0, 1), action: left, reward: 0.99409503184
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 18, 't': 7, 'action': 'left', 'reward': 0.9940950318399953, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 0.99)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 2), heading: (0, 1), action: None, reward: 1.31346679449
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 17, 't': 8, 'action': None, 'reward': 1.3134667944905352, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.31)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 3), heading: (0, 1), action: forward, reward: 1.23506843095
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 16, 't': 9, 'action': 'forward', 'reward': 1.2350684309490836, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.24)
60% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 184
\-------------------------
Environment.reset(): Trial set up with start = (3, 7), destination = (5, 3), deadline = 20
Simulating trial. . .
epsilon = 0.1604; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 7), heading: (-1, 0), action: left, reward: 1.14813462834
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', 'forward'), 'deadline': 20, 't': 0, 'action': 'left', 'reward': 1.1481346283365823, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', 'forward')
Agent drove left instead of right. (rewarded 1.15)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 7), heading: (-1, 0), action: None, reward: 1.21195666247
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 19, 't': 1, 'action': None, 'reward': 1.2119566624668183, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.21)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 7), heading: (-1, 0), action: None, reward: 1.40708094174
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.4070809417390588, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.41)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 7), heading: (-1, 0), action: None, reward: 1.92013470983
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 1.9201347098318764, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.92)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 6), heading: (0, -1), action: right, reward: 0.872439736936
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'left'), 'deadline': 16, 't': 4, 'action': 'right', 'reward': 0.8724397369361897, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'left')
Agent drove right instead of left. (rewarded 0.87)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: right, reward: 2.64745448473
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', 'left'), 'deadline': 15, 't': 5, 'action': 'right', 'reward': 2.647454484730676, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', 'left')
Agent followed the waypoint right. (rewarded 2.65)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: None, reward: 2.3978824482
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 14, 't': 6, 'action': None, 'reward': 2.397882448197687, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.40)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (4, 6), heading: (1, 0), action: forward, reward: 2.63846951167
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 2.6384695116735575, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 2.64)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (4, 6), heading: (1, 0), action: None, reward: 1.92101978385
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 12, 't': 8, 'action': None, 'reward': 1.921019783846059, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.92)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (4, 6), heading: (1, 0), action: None, reward: 2.45161490159
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 11, 't': 9, 'action': None, 'reward': 2.4516149015946347, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.45)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: forward, reward: 0.777190979844
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 10, 't': 10, 'action': 'forward', 'reward': 0.7771909798441534, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 0.78)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (5, 7), heading: (0, 1), action: right, reward: 1.16213178219
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 9, 't': 11, 'action': 'right', 'reward': 1.1621317821884025, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.16)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (5, 7), heading: (0, 1), action: None, reward: 2.59951093739
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 8, 't': 12, 'action': None, 'reward': 2.5995109373936605, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.60)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (5, 7), heading: (0, 1), action: left, reward: -19.279309175
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 3, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 7, 't': 13, 'action': 'left', 'reward': -19.279309175008187, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent attempted driving left through traffic and cause a minor accident. (rewarded -19.28)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (5, 2), heading: (0, 1), action: forward, reward: 0.579199153381
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 6, 't': 14, 'action': 'forward', 'reward': 0.5791991533807923, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 0.58)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (4, 2), heading: (-1, 0), action: right, reward: 0.677617838311
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', 'left'), 'deadline': 5, 't': 15, 'action': 'right', 'reward': 0.6776178383107543, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', 'left')
Agent drove right instead of forward. (rewarded 0.68)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (4, 3), heading: (0, 1), action: left, reward: 1.451527554
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 4, 't': 16, 'action': 'left', 'reward': 1.4515275539971255, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 1.45)
15% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (4, 3), heading: (0, 1), action: None, reward: 0.346354766912
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 3, 't': 17, 'action': None, 'reward': 0.3463547669115572, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.35)
10% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (4, 3), heading: (0, 1), action: None, reward: 1.12779020676
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 2, 't': 18, 'action': None, 'reward': 1.1277902067638899, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.13)
5% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 3), heading: (1, 0), action: left, reward: 2.00991132706
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 1, 't': 19, 'action': 'left', 'reward': 2.009911327060826, 'waypoint': 'left'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.01)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 185
\-------------------------
Environment.reset(): Trial set up with start = (5, 6), destination = (3, 3), deadline = 25
Simulating trial. . .
epsilon = 0.1588; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 7), heading: (0, 1), action: right, reward: 2.78413639161
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', 'left'), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 2.784136391614065, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', 'left')
Agent followed the waypoint right. (rewarded 2.78)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (4, 7), heading: (-1, 0), action: right, reward: 1.50464150291
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 24, 't': 1, 'action': 'right', 'reward': 1.5046415029100904, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.50)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 7), heading: (-1, 0), action: forward, reward: 2.04847801614
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 23, 't': 2, 'action': 'forward', 'reward': 2.048478016144574, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.05)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 7), heading: (-1, 0), action: None, reward: 1.07550074965
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 22, 't': 3, 'action': None, 'reward': 1.0755007496545947, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.08)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 7), heading: (-1, 0), action: None, reward: 1.50348195299
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 21, 't': 4, 'action': None, 'reward': 1.5034819529932657, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.50)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 7), heading: (-1, 0), action: forward, reward: 1.21184415854
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', None), 'deadline': 20, 't': 5, 'action': 'forward', 'reward': 1.211844158539665, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', None)
Agent drove forward instead of left. (rewarded 1.21)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 2), heading: (0, 1), action: left, reward: 1.26514080918
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 19, 't': 6, 'action': 'left', 'reward': 1.2651408091755292, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.27)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 2), heading: (0, 1), action: None, reward: 1.71474988719
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'right'), 'deadline': 18, 't': 7, 'action': None, 'reward': 1.7147498871942932, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 1.71)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 2), heading: (0, 1), action: None, reward: 1.73554448504
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 17, 't': 8, 'action': None, 'reward': 1.7355444850373958, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.74)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 2), heading: (0, 1), action: None, reward: -5.25348022046
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 16, 't': 9, 'action': None, 'reward': -5.253480220459524, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -5.25)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (2, 3), heading: (0, 1), action: forward, reward: 1.34662835904
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', 'forward'), 'deadline': 15, 't': 10, 'action': 'forward', 'reward': 1.3466283590432773, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', 'forward')
Agent drove forward instead of left. (rewarded 1.35)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: left, reward: 1.64140775914
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 14, 't': 11, 'action': 'left', 'reward': 1.6414077591360803, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.64)
52% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 186
\-------------------------
Environment.reset(): Trial set up with start = (8, 7), destination = (2, 3), deadline = 20
Simulating trial. . .
epsilon = 0.1572; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: None, reward: 2.08982218974
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 20, 't': 0, 'action': None, 'reward': 2.089822189735731, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.09)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: forward, reward: -39.2281446714
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', 'forward', 'forward'), 'deadline': 19, 't': 1, 'action': 'forward', 'reward': -39.22814467141258, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'forward')
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -39.23)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: None, reward: 1.65446835595
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.654468355950151, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.65)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: None, reward: 1.79075862481
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 1.7907586248134653, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.79)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: forward, reward: 2.63432502681
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 2.634325026810374, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.63)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: None, reward: 1.92642969508
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 15, 't': 5, 'action': None, 'reward': 1.926429695083889, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.93)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: None, reward: 2.83518858132
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 14, 't': 6, 'action': None, 'reward': 2.835188581318463, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.84)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 7), heading: (1, 0), action: forward, reward: 1.67225804034
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 1.6722580403395708, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.67)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 2), heading: (0, 1), action: right, reward: 1.13720735655
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'right', None), 'deadline': 12, 't': 8, 'action': 'right', 'reward': 1.1372073565541074, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'right', None)
Agent followed the waypoint right. (rewarded 1.14)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 2), heading: (0, 1), action: None, reward: 2.4553893939
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'forward'), 'deadline': 11, 't': 9, 'action': None, 'reward': 2.4553893938964197, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 2.46)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (2, 2), heading: (0, 1), action: None, reward: 2.56212098159
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 10, 't': 10, 'action': None, 'reward': 2.562120981593334, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.56)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (2, 3), heading: (0, 1), action: forward, reward: 0.871516454707
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 9, 't': 11, 'action': 'forward', 'reward': 0.8715164547070293, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 0.87)
40% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 187
\-------------------------
Environment.reset(): Trial set up with start = (3, 6), destination = (6, 7), deadline = 20
Simulating trial. . .
epsilon = 0.1557; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (4, 6), heading: (1, 0), action: forward, reward: 1.50178436985
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 20, 't': 0, 'action': 'forward', 'reward': 1.5017843698544067, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.50)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (4, 6), heading: (1, 0), action: None, reward: 2.87655626139
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 19, 't': 1, 'action': None, 'reward': 2.8765562613889513, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 2.88)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (4, 7), heading: (0, 1), action: right, reward: 0.968902600933
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', 'right'), 'deadline': 18, 't': 2, 'action': 'right', 'reward': 0.9689026009330087, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', 'right')
Agent drove right instead of forward. (rewarded 0.97)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (4, 7), heading: (0, 1), action: None, reward: 1.8827324349
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 1.8827324348998942, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.88)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 7), heading: (-1, 0), action: right, reward: 0.568023034449
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 16, 't': 4, 'action': 'right', 'reward': 0.5680230344485288, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 0.57)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (3, 6), heading: (0, -1), action: right, reward: 2.73855781673
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'right', None), 'deadline': 15, 't': 5, 'action': 'right', 'reward': 2.738557816728645, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'right', None)
Agent followed the waypoint right. (rewarded 2.74)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 6), heading: (1, 0), action: right, reward: 2.27873434577
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', 'right'), 'deadline': 14, 't': 6, 'action': 'right', 'reward': 2.2787343457740477, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', 'right')
Agent followed the waypoint right. (rewarded 2.28)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (4, 7), heading: (0, 1), action: right, reward: 0.21596614281
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 13, 't': 7, 'action': 'right', 'reward': 0.21596614281028303, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent drove right instead of forward. (rewarded 0.22)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (4, 7), heading: (0, 1), action: None, reward: 1.23402828625
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 12, 't': 8, 'action': None, 'reward': 1.234028286247606, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.23)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (4, 7), heading: (0, 1), action: None, reward: 0.977369467454
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 11, 't': 9, 'action': None, 'reward': 0.9773694674541176, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 0.98)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (5, 7), heading: (1, 0), action: left, reward: 1.44662972496
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 10, 't': 10, 'action': 'left', 'reward': 1.4466297249587352, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.45)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 7), heading: (1, 0), action: forward, reward: 1.25667528132
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 9, 't': 11, 'action': 'forward', 'reward': 1.2566752813247526, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.26)
40% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 188
\-------------------------
Environment.reset(): Trial set up with start = (3, 4), destination = (7, 4), deadline = 20
Simulating trial. . .
epsilon = 0.1541; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 4), heading: (-1, 0), action: forward, reward: 1.32052018084
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 20, 't': 0, 'action': 'forward', 'reward': 1.3205201808400386, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.32)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: forward, reward: 2.912168746
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 19, 't': 1, 'action': 'forward', 'reward': 2.912168746002918, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.91)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 4), heading: (-1, 0), action: forward, reward: 1.17254006252
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 18, 't': 2, 'action': 'forward', 'reward': 1.1725400625161193, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.17)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 3), heading: (0, -1), action: right, reward: 0.360846948373
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', 'right'), 'deadline': 17, 't': 3, 'action': 'right', 'reward': 0.3608469483730551, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', 'right')
Agent drove right instead of forward. (rewarded 0.36)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 3), heading: (-1, 0), action: left, reward: 1.63218380896
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 16, 't': 4, 'action': 'left', 'reward': 1.6321838089619711, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.63)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (7, 4), heading: (0, 1), action: left, reward: 2.12006239305
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 15, 't': 5, 'action': 'left', 'reward': 2.120062393049831, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.12)
70% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 189
\-------------------------
Environment.reset(): Trial set up with start = (2, 3), destination = (4, 7), deadline = 20
Simulating trial. . .
epsilon = 0.1526; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: right, reward: 2.85668763002
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 2.8566876300236115, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.86)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: None, reward: 2.16504871333
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 19, 't': 1, 'action': None, 'reward': 2.165048713331952, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.17)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: None, reward: 2.86449912438
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'right'), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.864499124375185, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'right')
Agent properly idled at a red light. (rewarded 2.86)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (4, 3), heading: (1, 0), action: forward, reward: 1.87286881567
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 1.8728688156744768, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.87)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (4, 2), heading: (0, -1), action: left, reward: 1.99123705063
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'forward'), 'deadline': 16, 't': 4, 'action': 'left', 'reward': 1.9912370506347925, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'forward')
Agent followed the waypoint left. (rewarded 1.99)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 7), heading: (0, -1), action: forward, reward: 1.51072171078
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 1.5107217107777284, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.51)
70% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 190
\-------------------------
Environment.reset(): Trial set up with start = (1, 4), destination = (6, 6), deadline = 25
Simulating trial. . .
epsilon = 0.1511; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 4), heading: (-1, 0), action: right, reward: 1.93897635838
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 1.938976358379196, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.94)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (7, 4), heading: (-1, 0), action: forward, reward: 1.6348769756
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 24, 't': 1, 'action': 'forward', 'reward': 1.6348769755972714, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.63)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 4), heading: (-1, 0), action: None, reward: 1.2150560314
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.2150560314000067, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.22)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 4), heading: (-1, 0), action: left, reward: -10.9724215147
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'left', 'right'), 'deadline': 22, 't': 3, 'action': 'left', 'reward': -10.972421514653442, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'right')
Agent attempted driving left through a red light. (rewarded -10.97)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (6, 4), heading: (-1, 0), action: forward, reward: 2.32656191652
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 2.3265619165185445, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.33)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (6, 3), heading: (0, -1), action: right, reward: 0.608024178562
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 20, 't': 5, 'action': 'right', 'reward': 0.6080241785620458, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove right instead of left. (rewarded 0.61)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (6, 2), heading: (0, -1), action: forward, reward: 1.00332088155
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 19, 't': 6, 'action': 'forward', 'reward': 1.003320881550635, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.00)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 7), heading: (0, -1), action: forward, reward: 2.56618738517
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 2.5661873851697132, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.57)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 6), heading: (0, -1), action: forward, reward: 0.939376695864
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 17, 't': 8, 'action': 'forward', 'reward': 0.9393766958636931, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent followed the waypoint forward. (rewarded 0.94)
64% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 191
\-------------------------
Environment.reset(): Trial set up with start = (6, 2), destination = (1, 4), deadline = 25
Simulating trial. . .
epsilon = 0.1496; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (7, 2), heading: (1, 0), action: right, reward: 1.37596818147
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 1.3759681814725149, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.38)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 2), heading: (1, 0), action: forward, reward: 1.22203503955
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 24, 't': 1, 'action': 'forward', 'reward': 1.2220350395535853, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.22)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: forward, reward: 2.76576622024
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 23, 't': 2, 'action': 'forward', 'reward': 2.765766220241619, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent followed the waypoint forward. (rewarded 2.77)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: None, reward: 0.306652039963
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'right', 'forward'), 'deadline': 22, 't': 3, 'action': None, 'reward': 0.3066520399634025, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', 'forward')
Agent properly idled at a red light. (rewarded 0.31)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 3), heading: (0, 1), action: right, reward: 2.78505373553
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'right', None), 'deadline': 21, 't': 4, 'action': 'right', 'reward': 2.785053735532224, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', None)
Agent followed the waypoint right. (rewarded 2.79)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 4), heading: (0, 1), action: forward, reward: 1.82875859259
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 20, 't': 5, 'action': 'forward', 'reward': 1.8287585925894585, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'right')
Agent followed the waypoint forward. (rewarded 1.83)
76% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 192
\-------------------------
Environment.reset(): Trial set up with start = (7, 4), destination = (2, 2), deadline = 25
Simulating trial. . .
epsilon = 0.1481; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (7, 4), heading: (0, 1), action: None, reward: 2.46010079486
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 25, 't': 0, 'action': None, 'reward': 2.46010079486122, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.46)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (7, 4), heading: (0, 1), action: None, reward: 1.70903904449
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'right'), 'deadline': 24, 't': 1, 'action': None, 'reward': 1.7090390444853487, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'right')
Agent properly idled at a red light. (rewarded 1.71)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 4), heading: (0, 1), action: None, reward: 1.64452363358
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.6445236335814855, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.64)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 4), heading: (0, 1), action: None, reward: 1.22848548876
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 22, 't': 3, 'action': None, 'reward': 1.228485488764844, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.23)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 4), heading: (1, 0), action: left, reward: 2.84835432011
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 21, 't': 4, 'action': 'left', 'reward': 2.848354320108349, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.85)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 4), heading: (1, 0), action: forward, reward: 1.68726621138
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 20, 't': 5, 'action': 'forward', 'reward': 1.687266211383102, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.69)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 4), heading: (1, 0), action: forward, reward: 2.36649064409
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 19, 't': 6, 'action': 'forward', 'reward': 2.3664906440874605, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.37)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 4), heading: (1, 0), action: None, reward: 1.02017172599
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 18, 't': 7, 'action': None, 'reward': 1.020171725990273, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.02)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 4), heading: (1, 0), action: None, reward: 1.5914989312
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 17, 't': 8, 'action': None, 'reward': 1.5914989312017334, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.59)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 4), heading: (1, 0), action: None, reward: 1.45525198776
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 16, 't': 9, 'action': None, 'reward': 1.4552519877647054, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.46)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (2, 5), heading: (0, 1), action: right, reward: 0.340685047594
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 15, 't': 10, 'action': 'right', 'reward': 0.3406850475937596, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove right instead of left. (rewarded 0.34)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (2, 5), heading: (0, 1), action: forward, reward: -9.76608336506
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 14, 't': 11, 'action': 'forward', 'reward': -9.766083365062489, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent attempted driving forward through a red light. (rewarded -9.77)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (2, 5), heading: (0, 1), action: None, reward: 1.89818623717
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 13, 't': 12, 'action': None, 'reward': 1.8981862371702314, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.90)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: forward, reward: 2.33987274446
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 12, 't': 13, 'action': 'forward', 'reward': 2.3398727444641696, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 2.34)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: None, reward: 1.66829055419
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 11, 't': 14, 'action': None, 'reward': 1.6682905541946793, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.67)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: None, reward: 1.29120388083
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 10, 't': 15, 'action': None, 'reward': 1.2912038808268889, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.29)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: None, reward: 1.83147001854
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 9, 't': 16, 'action': None, 'reward': 1.8314700185379658, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.83)
32% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (2, 7), heading: (0, 1), action: forward, reward: 0.885466567566
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 8, 't': 17, 'action': 'forward', 'reward': 0.885466567566106, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 0.89)
28% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (2, 7), heading: (0, 1), action: None, reward: 2.27889169039
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 7, 't': 18, 'action': None, 'reward': 2.2788916903873924, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.28)
24% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (2, 2), heading: (0, 1), action: forward, reward: 0.953878778559
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 6, 't': 19, 'action': 'forward', 'reward': 0.9538787785587661, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 0.95)
20% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 193
\-------------------------
Environment.reset(): Trial set up with start = (3, 7), destination = (6, 4), deadline = 30
Simulating trial. . .
epsilon = 0.1466; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 2), heading: (0, 1), action: right, reward: 1.74996312443
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'left'), 'deadline': 30, 't': 0, 'action': 'right', 'reward': 1.749963124432259, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'left')
Agent drove right instead of forward. (rewarded 1.75)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 2), heading: (0, 1), action: left, reward: -9.0548775354
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 29, 't': 1, 'action': 'left', 'reward': -9.054877535396384, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent attempted driving left through a red light. (rewarded -9.05)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 2), heading: (0, 1), action: None, reward: 2.64965971881
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 28, 't': 2, 'action': None, 'reward': 2.649659718813634, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.65)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 2), heading: (0, 1), action: None, reward: 1.63311708518
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 27, 't': 3, 'action': None, 'reward': 1.6331170851841899, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.63)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 2), heading: (0, 1), action: None, reward: 1.09870510798
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 26, 't': 4, 'action': None, 'reward': 1.0987051079776855, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.10)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (4, 2), heading: (1, 0), action: left, reward: 2.76308963922
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 25, 't': 5, 'action': 'left', 'reward': 2.7630896392177364, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.76)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (5, 2), heading: (1, 0), action: forward, reward: 2.28432836889
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 24, 't': 6, 'action': 'forward', 'reward': 2.2843283688933473, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 2.28)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (5, 2), heading: (1, 0), action: None, reward: 2.43211273098
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 23, 't': 7, 'action': None, 'reward': 2.4321127309778405, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.43)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (5, 2), heading: (1, 0), action: None, reward: 1.29093250825
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 22, 't': 8, 'action': None, 'reward': 1.2909325082500414, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.29)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (5, 2), heading: (1, 0), action: None, reward: 2.49989911177
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'forward'), 'deadline': 21, 't': 9, 'action': None, 'reward': 2.4998991117652496, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 2.50)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (6, 2), heading: (1, 0), action: forward, reward: 2.21054719088
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 20, 't': 10, 'action': 'forward', 'reward': 2.210547190878809, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.21)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (6, 3), heading: (0, 1), action: right, reward: 1.81086537351
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 19, 't': 11, 'action': 'right', 'reward': 1.8108653735092863, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.81)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 4), heading: (0, 1), action: forward, reward: 2.29216324163
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 18, 't': 12, 'action': 'forward', 'reward': 2.2921632416299556, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.29)
57% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 194
\-------------------------
Environment.reset(): Trial set up with start = (7, 6), destination = (3, 4), deadline = 30
Simulating trial. . .
epsilon = 0.1451; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (7, 5), heading: (0, -1), action: right, reward: 2.16300513748
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 30, 't': 0, 'action': 'right', 'reward': 2.1630051374804107, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.16)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 5), heading: (1, 0), action: right, reward: 1.51696321467
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 29, 't': 1, 'action': 'right', 'reward': 1.5169632146738619, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.52)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 6), heading: (0, 1), action: right, reward: 0.801252921867
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', 'forward'), 'deadline': 28, 't': 2, 'action': 'right', 'reward': 0.8012529218666603, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', 'forward')
Agent drove right instead of forward. (rewarded 0.80)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 6), heading: (0, 1), action: None, reward: 1.18393180757
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 27, 't': 3, 'action': None, 'reward': 1.1839318075685552, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.18)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: left, reward: 1.440635688
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 26, 't': 4, 'action': 'left', 'reward': 1.4406356880033848, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.44)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: forward, reward: 2.43581119742
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 25, 't': 5, 'action': 'forward', 'reward': 2.435811197424608, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.44)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: None, reward: 1.80938654247
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 24, 't': 6, 'action': None, 'reward': 1.8093865424715239, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.81)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: None, reward: 1.27395744445
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 23, 't': 7, 'action': None, 'reward': 1.2739574444525021, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.27)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: None, reward: 1.25931679976
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 22, 't': 8, 'action': None, 'reward': 1.259316799761925, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.26)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: forward, reward: 1.70650799481
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 21, 't': 9, 'action': 'forward', 'reward': 1.7065079948120176, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.71)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (3, 5), heading: (0, -1), action: left, reward: 2.54216288277
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 20, 't': 10, 'action': 'left', 'reward': 2.542162882766143, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.54)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 4), heading: (0, -1), action: forward, reward: 0.983831575244
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 19, 't': 11, 'action': 'forward', 'reward': 0.9838315752436362, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'right')
Agent followed the waypoint forward. (rewarded 0.98)
60% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 195
\-------------------------
Environment.reset(): Trial set up with start = (1, 4), destination = (6, 5), deadline = 20
Simulating trial. . .
epsilon = 0.1437; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 5), heading: (0, 1), action: left, reward: 1.74931800654
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 20, 't': 0, 'action': 'left', 'reward': 1.7493180065423541, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'right')
Agent drove left instead of forward. (rewarded 1.75)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: right, reward: 1.78647365989
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 1.7864736598945339, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.79)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: None, reward: -5.72279714007
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 18, 't': 2, 'action': None, 'reward': -5.722797140066303, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.72)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 5), heading: (-1, 0), action: forward, reward: 2.37392180299
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 2.3739218029926326, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.37)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 5), heading: (-1, 0), action: None, reward: -5.38730842579
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 16, 't': 4, 'action': None, 'reward': -5.387308425785859, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -5.39)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 5), heading: (-1, 0), action: forward, reward: 1.03902261239
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 1.039022612392431, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.04)
70% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 196
\-------------------------
Environment.reset(): Trial set up with start = (3, 4), destination = (7, 2), deadline = 30
Simulating trial. . .
epsilon = 0.1423; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: None, reward: 1.55440797281
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 30, 't': 0, 'action': None, 'reward': 1.5544079728074958, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.55)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: None, reward: 1.16456477956
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 29, 't': 1, 'action': None, 'reward': 1.1645647795568626, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.16)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: None, reward: 1.61281464963
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 28, 't': 2, 'action': None, 'reward': 1.612814649631783, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.61)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: None, reward: 1.21907074604
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 27, 't': 3, 'action': None, 'reward': 1.2190707460399104, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.22)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: None, reward: 2.92714416765
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'right'), 'deadline': 26, 't': 4, 'action': None, 'reward': 2.927144167650102, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'right')
Agent properly idled at a red light. (rewarded 2.93)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (3, 3), heading: (0, -1), action: left, reward: 1.47657468952
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 25, 't': 5, 'action': 'left', 'reward': 1.4765746895216265, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.48)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 2), heading: (0, -1), action: forward, reward: 0.490967081713
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', 'right'), 'deadline': 24, 't': 6, 'action': 'forward', 'reward': 0.4909670817126187, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', 'right')
Agent drove forward instead of left. (rewarded 0.49)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: left, reward: 2.45077151456
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 23, 't': 7, 'action': 'left', 'reward': 2.4507715145612967, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 2.45)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: None, reward: 1.22174028199
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 22, 't': 8, 'action': None, 'reward': 1.2217402819906238, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.22)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 7), heading: (0, -1), action: right, reward: 0.134933519001
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 21, 't': 9, 'action': 'right', 'reward': 0.1349335190005163, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent drove right instead of forward. (rewarded 0.13)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (2, 7), heading: (0, -1), action: None, reward: 1.09191321514
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'right'), 'deadline': 20, 't': 10, 'action': None, 'reward': 1.0919132151351931, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 1.09)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (2, 7), heading: (0, -1), action: None, reward: 1.93803066974
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 19, 't': 11, 'action': None, 'reward': 1.9380306697409602, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.94)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (2, 7), heading: (0, -1), action: None, reward: 2.41178739147
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 18, 't': 12, 'action': None, 'reward': 2.411787391469752, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.41)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (1, 7), heading: (-1, 0), action: left, reward: 2.14780415229
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 17, 't': 13, 'action': 'left', 'reward': 2.1478041522867843, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 2.15)
53% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (1, 7), heading: (-1, 0), action: None, reward: 2.5508687853
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', 'forward'), 'deadline': 16, 't': 14, 'action': None, 'reward': 2.5508687853041425, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', 'forward')
Agent properly idled at a red light. (rewarded 2.55)
50% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: forward, reward: 1.71052431703
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 15, 't': 15, 'action': 'forward', 'reward': 1.710524317034735, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.71)
47% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: None, reward: 1.08183658589
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 14, 't': 16, 'action': None, 'reward': 1.0818365858866665, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.08)
43% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: None, reward: 1.55720916816
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 13, 't': 17, 'action': None, 'reward': 1.5572091681565274, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.56)
40% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: None, reward: 2.14989521429
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 12, 't': 18, 'action': None, 'reward': 2.1498952142876995, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.15)
37% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: forward, reward: 1.63537604523
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 11, 't': 19, 'action': 'forward', 'reward': 1.6353760452314325, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.64)
33% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: None, reward: 2.32501288817
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 10, 't': 20, 'action': None, 'reward': 2.3250128881723295, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.33)
30% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: None, reward: 2.21751340975
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', 'left'), 'deadline': 9, 't': 21, 'action': None, 'reward': 2.2175134097514624, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', 'left')
Agent properly idled at a red light. (rewarded 2.22)
27% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (7, 6), heading: (0, -1), action: right, reward: 1.14138297166
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 8, 't': 22, 'action': 'right', 'reward': 1.1413829716605077, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 1.14)
23% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: right, reward: 1.08768811278
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'right'), 'deadline': 7, 't': 23, 'action': 'right', 'reward': 1.0876881127774085, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'right')
Agent followed the waypoint right. (rewarded 1.09)
20% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (8, 7), heading: (0, 1), action: right, reward: 0.996632698476
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 6, 't': 24, 'action': 'right', 'reward': 0.9966326984756069, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 1.00)
17% of time remaining to reach destination.
/-------------------
| Step 25 Results
\-------------------
Environment.step(): t = 25
Environment.act() [POST]: location: (8, 7), heading: (0, 1), action: right, reward: -19.5976246953
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 3, 'light': 'red', 'state': ('right', 'red', None, 'forward'), 'deadline': 5, 't': 25, 'action': 'right', 'reward': -19.59762469529078, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'forward')
Agent attempted driving right through traffic and cause a minor accident. (rewarded -19.60)
13% of time remaining to reach destination.
/-------------------
| Step 26 Results
\-------------------
Environment.step(): t = 26
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: right, reward: 1.79336867644
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 4, 't': 26, 'action': 'right', 'reward': 1.7933686764440455, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.79)
10% of time remaining to reach destination.
/-------------------
| Step 27 Results
\-------------------
Environment.step(): t = 27
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: None, reward: 0.806850785178
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 3, 't': 27, 'action': None, 'reward': 0.806850785178137, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 0.81)
7% of time remaining to reach destination.
/-------------------
| Step 28 Results
\-------------------
Environment.step(): t = 28
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (7, 2), heading: (0, 1), action: left, reward: 1.38370203135
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 2, 't': 28, 'action': 'left', 'reward': 1.3837020313475101, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.38)
3% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 197
\-------------------------
Environment.reset(): Trial set up with start = (3, 5), destination = (6, 7), deadline = 25
Simulating trial. . .
epsilon = 0.1409; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: left, reward: 2.95030047909
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 25, 't': 0, 'action': 'left', 'reward': 2.950300479086243, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.95)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 5), heading: (1, 0), action: forward, reward: 2.92054191096
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 24, 't': 1, 'action': 'forward', 'reward': 2.920541910956463, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.92)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 5), heading: (1, 0), action: forward, reward: 2.30528363009
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 23, 't': 2, 'action': 'forward', 'reward': 2.305283630094346, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 2.31)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (6, 6), heading: (0, 1), action: right, reward: 2.04807360198
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 22, 't': 3, 'action': 'right', 'reward': 2.0480736019827934, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent followed the waypoint right. (rewarded 2.05)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 7), heading: (0, 1), action: forward, reward: 2.51122648009
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 2.511226480088984, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.51)
80% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 198
\-------------------------
Environment.reset(): Trial set up with start = (4, 7), destination = (2, 4), deadline = 25
Simulating trial. . .
epsilon = 0.1395; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (4, 2), heading: (0, 1), action: right, reward: 2.63315388661
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 2.6331538866097515, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent followed the waypoint right. (rewarded 2.63)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 2), heading: (-1, 0), action: right, reward: 1.84327234377
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'right'), 'deadline': 24, 't': 1, 'action': 'right', 'reward': 1.8432723437737437, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'right')
Agent followed the waypoint right. (rewarded 1.84)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: forward, reward: 2.24625663462
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 23, 't': 2, 'action': 'forward', 'reward': 2.2462566346153015, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.25)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: left, reward: -10.997100015
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 22, 't': 3, 'action': 'left', 'reward': -10.99710001496495, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent attempted driving left through a red light. (rewarded -11.00)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: None, reward: 1.85256578447
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 21, 't': 4, 'action': None, 'reward': 1.8525657844655352, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.85)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 3), heading: (0, 1), action: left, reward: 1.40367822931
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 20, 't': 5, 'action': 'left', 'reward': 1.403678229310059, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.40)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (2, 4), heading: (0, 1), action: forward, reward: 1.223561383
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 19, 't': 6, 'action': 'forward', 'reward': 1.2235613830031422, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.22)
72% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 199
\-------------------------
Environment.reset(): Trial set up with start = (5, 6), destination = (1, 4), deadline = 30
Simulating trial. . .
epsilon = 0.1381; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: None, reward: 2.99203763862
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 30, 't': 0, 'action': None, 'reward': 2.9920376386234757, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.99)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: None, reward: 2.8889172138
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 29, 't': 1, 'action': None, 'reward': 2.888917213799227, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 2.89)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: None, reward: 2.6285978906
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 28, 't': 2, 'action': None, 'reward': 2.6285978905958105, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 2.63)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: None, reward: 2.39463740856
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 27, 't': 3, 'action': None, 'reward': 2.394637408564196, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.39)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: None, reward: 1.57297891707
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 26, 't': 4, 'action': None, 'reward': 1.5729789170705593, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.57)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: None, reward: 0.990912907419
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 25, 't': 5, 'action': None, 'reward': 0.9909129074194896, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 0.99)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (6, 6), heading: (1, 0), action: forward, reward: 1.11708482222
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 24, 't': 6, 'action': 'forward', 'reward': 1.1170848222219474, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.12)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 6), heading: (1, 0), action: None, reward: 1.88147409285
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 23, 't': 7, 'action': None, 'reward': 1.881474092850664, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.88)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (6, 6), heading: (1, 0), action: None, reward: 2.17872652666
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 22, 't': 8, 'action': None, 'reward': 2.178726526660792, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.18)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (6, 6), heading: (1, 0), action: None, reward: 1.95671455217
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'forward'), 'deadline': 21, 't': 9, 'action': None, 'reward': 1.9567145521652762, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 1.96)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (7, 6), heading: (1, 0), action: forward, reward: 1.37182655837
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 20, 't': 10, 'action': 'forward', 'reward': 1.3718265583654363, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.37)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (7, 6), heading: (1, 0), action: right, reward: -19.4131836794
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': 'forward'}, 'violation': 3, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 19, 't': 11, 'action': 'right', 'reward': -19.41318367942647, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent attempted driving right through traffic and cause a minor accident. (rewarded -19.41)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (7, 6), heading: (1, 0), action: None, reward: 1.67881560491
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 18, 't': 12, 'action': None, 'reward': 1.6788156049101077, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.68)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (7, 6), heading: (1, 0), action: None, reward: 1.71345523236
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 17, 't': 13, 'action': None, 'reward': 1.713455232360603, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.71)
53% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (7, 6), heading: (1, 0), action: left, reward: -9.30978688905
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 16, 't': 14, 'action': 'left', 'reward': -9.309786889053182, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -9.31)
50% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (7, 6), heading: (1, 0), action: left, reward: -10.6497891236
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 15, 't': 15, 'action': 'left', 'reward': -10.649789123623721, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -10.65)
47% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: forward, reward: 1.77841718986
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 14, 't': 16, 'action': 'forward', 'reward': 1.7784171898575125, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.78)
43% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: None, reward: 2.10465392929
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 13, 't': 17, 'action': None, 'reward': 2.1046539292906514, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 2.10)
40% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: None, reward: 1.81524789415
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 12, 't': 18, 'action': None, 'reward': 1.815247894146428, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.82)
37% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: forward, reward: 1.59846929086
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 11, 't': 19, 'action': 'forward', 'reward': 1.5984692908553662, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.60)
33% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: None, reward: 1.62629350977
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 10, 't': 20, 'action': None, 'reward': 1.6262935097713753, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.63)
30% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: None, reward: 0.926145912995
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 9, 't': 21, 'action': None, 'reward': 0.9261459129954659, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.93)
27% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (1, 5), heading: (0, -1), action: left, reward: 0.965097873499
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 8, 't': 22, 'action': 'left', 'reward': 0.9650978734992839, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 0.97)
23% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (1, 5), heading: (0, -1), action: None, reward: 2.41728773459
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 7, 't': 23, 'action': None, 'reward': 2.417287734586241, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.42)
20% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (1, 5), heading: (0, -1), action: None, reward: 1.31617868517
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 6, 't': 24, 'action': None, 'reward': 1.3161786851747002, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.32)
17% of time remaining to reach destination.
/-------------------
| Step 25 Results
\-------------------
Environment.step(): t = 25
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 4), heading: (0, -1), action: forward, reward: 1.41816034531
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 5, 't': 25, 'action': 'forward', 'reward': 1.4181603453114846, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.42)
13% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 200
\-------------------------
Environment.reset(): Trial set up with start = (8, 7), destination = (5, 3), deadline = 25
Simulating trial. . .
epsilon = 0.1367; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 6), heading: (0, -1), action: right, reward: 1.02060265194
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', 'right'), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 1.0206026519394793, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', 'right')
Agent drove right instead of forward. (rewarded 1.02)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 6), heading: (0, -1), action: None, reward: 2.5293192714
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 24, 't': 1, 'action': None, 'reward': 2.529319271397033, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.53)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: right, reward: 0.156852916928
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 23, 't': 2, 'action': 'right', 'reward': 0.15685291692781522, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent drove right instead of left. (rewarded 0.16)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: right, reward: 2.67306117793
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 22, 't': 3, 'action': 'right', 'reward': 2.673061177927379, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 2.67)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: right, reward: 1.5828046235
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 21, 't': 4, 'action': 'right', 'reward': 1.5828046235004882, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.58)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: None, reward: 1.16072841192
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 20, 't': 5, 'action': None, 'reward': 1.160728411916972, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 1.16)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: None, reward: 2.47610379588
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 19, 't': 6, 'action': None, 'reward': 2.476103795876644, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.48)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: forward, reward: 2.51874189036
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 2.518741890357834, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.52)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: None, reward: 2.57963895917
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 17, 't': 8, 'action': None, 'reward': 2.579638959174458, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.58)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (7, 6), heading: (0, -1), action: right, reward: 0.0237918328634
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 16, 't': 9, 'action': 'right', 'reward': 0.023791832863391038, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent drove right instead of forward. (rewarded 0.02)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (6, 6), heading: (-1, 0), action: left, reward: 2.34096881186
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 15, 't': 10, 'action': 'left', 'reward': 2.3409688118639562, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 2.34)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (5, 6), heading: (-1, 0), action: forward, reward: 2.60188384757
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 14, 't': 11, 'action': 'forward', 'reward': 2.601883847568418, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.60)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (5, 7), heading: (0, 1), action: left, reward: 2.75145408081
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 13, 't': 12, 'action': 'left', 'reward': 2.7514540808131382, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 2.75)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (5, 2), heading: (0, 1), action: forward, reward: 2.58712747103
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 12, 't': 13, 'action': 'forward', 'reward': 2.587127471032179, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.59)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: forward, reward: 0.73920023981
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 11, 't': 14, 'action': 'forward', 'reward': 0.739200239810208, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent followed the waypoint forward. (rewarded 0.74)
40% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 201
\-------------------------
Environment.reset(): Trial set up with start = (2, 7), destination = (4, 3), deadline = 20
Simulating trial. . .
epsilon = 0.1353; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 7), heading: (1, 0), action: None, reward: 2.05206037435
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 20, 't': 0, 'action': None, 'reward': 2.0520603743504378, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.05)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 7), heading: (1, 0), action: None, reward: 2.22091556005
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'right'), 'deadline': 19, 't': 1, 'action': None, 'reward': 2.220915560053758, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'right')
Agent properly idled at a red light. (rewarded 2.22)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 7), heading: (1, 0), action: None, reward: 1.12591808274
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.1259180827358493, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.13)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 7), heading: (1, 0), action: None, reward: 1.32735247965
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 1.3273524796522609, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.33)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 7), heading: (1, 0), action: forward, reward: 1.36555360329
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 1.3655536032914637, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.37)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (3, 7), heading: (1, 0), action: None, reward: 2.26783125936
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 15, 't': 5, 'action': None, 'reward': 2.26783125936111, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.27)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 7), heading: (1, 0), action: None, reward: 1.17072223826
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 14, 't': 6, 'action': None, 'reward': 1.1707222382590636, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.17)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (4, 7), heading: (1, 0), action: forward, reward: 2.01503752564
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 2.0150375256425277, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 2.02)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (5, 7), heading: (1, 0), action: forward, reward: 0.523305350471
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', 'right'), 'deadline': 12, 't': 8, 'action': 'forward', 'reward': 0.5233053504709301, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', 'right')
Agent drove forward instead of right. (rewarded 0.52)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (5, 7), heading: (1, 0), action: None, reward: 1.72063465506
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'forward'), 'deadline': 11, 't': 9, 'action': None, 'reward': 1.7206346550603189, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.72)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (5, 2), heading: (0, 1), action: right, reward: 1.45296846507
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 10, 't': 10, 'action': 'right', 'reward': 1.452968465065547, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.45)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (4, 2), heading: (-1, 0), action: right, reward: 1.32300127067
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 9, 't': 11, 'action': 'right', 'reward': 1.323001270665123, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.32)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 3), heading: (0, 1), action: left, reward: 1.97672803556
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 8, 't': 12, 'action': 'left', 'reward': 1.9767280355580068, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.98)
35% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 202
\-------------------------
Environment.reset(): Trial set up with start = (8, 5), destination = (4, 3), deadline = 30
Simulating trial. . .
epsilon = 0.1340; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 5), heading: (1, 0), action: right, reward: 2.5077698336
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 30, 't': 0, 'action': 'right', 'reward': 2.5077698336000243, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 2.51)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: forward, reward: 1.3892949481
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 29, 't': 1, 'action': 'forward', 'reward': 1.3892949480973442, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent followed the waypoint forward. (rewarded 1.39)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: None, reward: 2.61495381215
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 28, 't': 2, 'action': None, 'reward': 2.614953812154705, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.61)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: None, reward: 1.57549330066
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 27, 't': 3, 'action': None, 'reward': 1.575493300657295, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.58)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: forward, reward: 2.60361008413
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 26, 't': 4, 'action': 'forward', 'reward': 2.603610084134777, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.60)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: None, reward: 1.93590296254
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 25, 't': 5, 'action': None, 'reward': 1.9359029625371784, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.94)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: None, reward: 2.62849906672
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 24, 't': 6, 'action': None, 'reward': 2.628499066715009, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.63)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: forward, reward: 2.54442594823
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 23, 't': 7, 'action': 'forward', 'reward': 2.5444259482303546, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 2.54)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: None, reward: 2.65556921982
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 22, 't': 8, 'action': None, 'reward': 2.655569219816978, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.66)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: None, reward: -4.0464736284
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 21, 't': 9, 'action': None, 'reward': -4.046473628402141, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.05)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (4, 4), heading: (0, -1), action: left, reward: 2.0364218657
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 20, 't': 10, 'action': 'left', 'reward': 2.0364218657009143, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.04)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 3), heading: (0, -1), action: forward, reward: 2.01326214867
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 19, 't': 11, 'action': 'forward', 'reward': 2.013262148668483, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.01)
60% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 203
\-------------------------
Environment.reset(): Trial set up with start = (8, 2), destination = (4, 7), deadline = 25
Simulating trial. . .
epsilon = 0.1327; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 2), heading: (0, 1), action: None, reward: 1.55082328861
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'left'), 'deadline': 25, 't': 0, 'action': None, 'reward': 1.5508232886123896, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 1.55)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 2), heading: (0, 1), action: None, reward: 1.35576862529
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 24, 't': 1, 'action': None, 'reward': 1.3557686252893073, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.36)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 2), heading: (0, 1), action: None, reward: 1.3456648886
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.3456648886039781, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.35)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 2), heading: (0, 1), action: None, reward: 2.74374405375
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 22, 't': 3, 'action': None, 'reward': 2.7437440537482205, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.74)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: left, reward: 1.66440658143
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 21, 't': 4, 'action': 'left', 'reward': 1.6644065814327205, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.66)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: None, reward: 1.1570363271
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 20, 't': 5, 'action': None, 'reward': 1.1570363270978097, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.16)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: None, reward: 1.14873709456
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'right'), 'deadline': 19, 't': 6, 'action': None, 'reward': 1.148737094564134, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 1.15)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 3), heading: (0, 1), action: right, reward: 0.883767750034
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', 'left'), 'deadline': 18, 't': 7, 'action': 'right', 'reward': 0.8837677500342322, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', 'left')
Agent drove right instead of forward. (rewarded 0.88)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: left, reward: 2.34353317206
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 17, 't': 8, 'action': 'left', 'reward': 2.343533172056601, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 2.34)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: forward, reward: -9.33089848205
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 16, 't': 9, 'action': 'forward', 'reward': -9.33089848204731, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -9.33)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (2, 4), heading: (0, 1), action: right, reward: 1.83134898022
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 15, 't': 10, 'action': 'right', 'reward': 1.831348980217062, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent drove right instead of forward. (rewarded 1.83)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: left, reward: 2.54627325384
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 14, 't': 11, 'action': 'left', 'reward': 2.5462732538434083, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.55)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (4, 4), heading: (1, 0), action: forward, reward: 2.52513972395
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 13, 't': 12, 'action': 'forward', 'reward': 2.525139723945446, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.53)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (4, 4), heading: (1, 0), action: left, reward: -9.80362007973
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 12, 't': 13, 'action': 'left', 'reward': -9.803620079734507, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent attempted driving left through a red light. (rewarded -9.80)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (4, 4), heading: (1, 0), action: None, reward: 2.11790287278
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 11, 't': 14, 'action': None, 'reward': 2.1179028727778246, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.12)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (4, 4), heading: (1, 0), action: forward, reward: -9.37060826082
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 10, 't': 15, 'action': 'forward', 'reward': -9.370608260815438, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent attempted driving forward through a red light. (rewarded -9.37)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (4, 5), heading: (0, 1), action: right, reward: 0.281349642857
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 9, 't': 16, 'action': 'right', 'reward': 0.281349642857195, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 0.28)
32% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (4, 5), heading: (0, 1), action: None, reward: 2.20224813155
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 8, 't': 17, 'action': None, 'reward': 2.2022481315513143, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.20)
28% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (4, 5), heading: (0, 1), action: None, reward: 1.15479375368
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 7, 't': 18, 'action': None, 'reward': 1.1547937536810826, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.15)
24% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (4, 5), heading: (0, 1), action: None, reward: 1.90513879302
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 6, 't': 19, 'action': None, 'reward': 1.9051387930212238, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.91)
20% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (4, 5), heading: (0, 1), action: None, reward: 1.39021839579
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'right'), 'deadline': 5, 't': 20, 'action': None, 'reward': 1.3902183957949785, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 1.39)
16% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (4, 6), heading: (0, 1), action: forward, reward: 0.59399816388
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 4, 't': 21, 'action': 'forward', 'reward': 0.5939981638803025, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 0.59)
12% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: left, reward: 0.138359280549
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 3, 't': 22, 'action': 'left', 'reward': 0.13835928054870128, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent drove left instead of forward. (rewarded 0.14)
8% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (5, 7), heading: (0, 1), action: right, reward: 1.72278326051
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 2, 't': 23, 'action': 'right', 'reward': 1.7227832605068372, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.72)
4% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (5, 2), heading: (0, 1), action: forward, reward: 0.903292571582
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 1, 't': 24, 'action': 'forward', 'reward': 0.9032925715823379, 'waypoint': 'right'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('right', 'green', None, 'forward')
Agent drove forward instead of right. (rewarded 0.90)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 204
\-------------------------
Environment.reset(): Trial set up with start = (8, 5), destination = (5, 6), deadline = 20
Simulating trial. . .
epsilon = 0.1313; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (7, 5), heading: (-1, 0), action: forward, reward: 2.21593343231
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 20, 't': 0, 'action': 'forward', 'reward': 2.21593343230822, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.22)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 5), heading: (-1, 0), action: forward, reward: 2.80418573347
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 19, 't': 1, 'action': 'forward', 'reward': 2.8041857334743265, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'right')
Agent followed the waypoint forward. (rewarded 2.80)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 5), heading: (-1, 0), action: forward, reward: 1.738229312
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 18, 't': 2, 'action': 'forward', 'reward': 1.7382293119971968, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.74)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 4), heading: (0, -1), action: right, reward: 1.64069002548
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 17, 't': 3, 'action': 'right', 'reward': 1.640690025479759, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 1.64)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (6, 4), heading: (1, 0), action: right, reward: 2.09199272603
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 16, 't': 4, 'action': 'right', 'reward': 2.0919927260319353, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 2.09)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (6, 5), heading: (0, 1), action: right, reward: 1.18812869703
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 15, 't': 5, 'action': 'right', 'reward': 1.1881286970292184, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.19)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (5, 5), heading: (-1, 0), action: right, reward: 2.59805268077
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 14, 't': 6, 'action': 'right', 'reward': 2.598052680767487, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.60)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 6), heading: (0, 1), action: left, reward: 2.75517903646
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 13, 't': 7, 'action': 'left', 'reward': 2.7551790364554085, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 2.76)
60% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 205
\-------------------------
Environment.reset(): Trial set up with start = (6, 5), destination = (4, 2), deadline = 25
Simulating trial. . .
epsilon = 0.1300; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 6), heading: (0, 1), action: right, reward: 2.48553186394
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 2.485531863944316, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 2.49)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 6), heading: (-1, 0), action: right, reward: 1.4730023134
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 24, 't': 1, 'action': 'right', 'reward': 1.473002313404425, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent followed the waypoint right. (rewarded 1.47)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 6), heading: (-1, 0), action: None, reward: 2.91220359151
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 23, 't': 2, 'action': None, 'reward': 2.9122035915083213, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.91)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 6), heading: (-1, 0), action: None, reward: 2.640116885
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 22, 't': 3, 'action': None, 'reward': 2.6401168850033887, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.64)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 6), heading: (-1, 0), action: None, reward: 2.34056019557
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 21, 't': 4, 'action': None, 'reward': 2.340560195574093, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.34)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 6), heading: (-1, 0), action: None, reward: 2.45667128933
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 20, 't': 5, 'action': None, 'reward': 2.4566712893259, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.46)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 6), heading: (-1, 0), action: forward, reward: 2.53673714263
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 19, 't': 6, 'action': 'forward', 'reward': 2.5367371426334886, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.54)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (4, 7), heading: (0, 1), action: left, reward: 1.69177385868
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 18, 't': 7, 'action': 'left', 'reward': 1.6917738586820967, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 1.69)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 2), heading: (0, 1), action: forward, reward: 2.26692800934
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 17, 't': 8, 'action': 'forward', 'reward': 2.2669280093434927, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.27)
64% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 206
\-------------------------
Environment.reset(): Trial set up with start = (8, 7), destination = (4, 5), deadline = 30
Simulating trial. . .
epsilon = 0.1287; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: left, reward: 1.82742909029
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'forward'), 'deadline': 30, 't': 0, 'action': 'left', 'reward': 1.8274290902935297, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'forward')
Agent followed the waypoint left. (rewarded 1.83)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: None, reward: 2.7154997648
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 29, 't': 1, 'action': None, 'reward': 2.7154997647973884, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.72)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: None, reward: 2.36859057745
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 28, 't': 2, 'action': None, 'reward': 2.3685905774498215, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.37)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (1, 2), heading: (0, 1), action: right, reward: 0.587922176531
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 27, 't': 3, 'action': 'right', 'reward': 0.5879221765305754, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove right instead of forward. (rewarded 0.59)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 2), heading: (1, 0), action: left, reward: 2.4156869078
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'right'), 'deadline': 26, 't': 4, 'action': 'left', 'reward': 2.4156869078016987, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'right')
Agent followed the waypoint left. (rewarded 2.42)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (3, 2), heading: (1, 0), action: forward, reward: 1.49044173353
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 25, 't': 5, 'action': 'forward', 'reward': 1.4904417335304159, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.49)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 2), heading: (1, 0), action: None, reward: 2.30041851739
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 24, 't': 6, 'action': None, 'reward': 2.3004185173922522, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.30)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 2), heading: (1, 0), action: None, reward: 1.18167420702
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 23, 't': 7, 'action': None, 'reward': 1.1816742070218758, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.18)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 2), heading: (1, 0), action: None, reward: 1.17562213087
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 22, 't': 8, 'action': None, 'reward': 1.1756221308672514, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.18)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (4, 2), heading: (1, 0), action: forward, reward: 2.69781940464
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 21, 't': 9, 'action': 'forward', 'reward': 2.697819404642793, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 2.70)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (4, 2), heading: (1, 0), action: None, reward: 1.43489007674
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 20, 't': 10, 'action': None, 'reward': 1.4348900767388877, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.43)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (4, 7), heading: (0, -1), action: left, reward: 1.57960520418
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 19, 't': 11, 'action': 'left', 'reward': 1.5796052041834454, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.58)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (4, 7), heading: (0, -1), action: None, reward: 2.48352290533
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 18, 't': 12, 'action': None, 'reward': 2.4835229053323884, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.48)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (4, 6), heading: (0, -1), action: forward, reward: 1.99090180119
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 17, 't': 13, 'action': 'forward', 'reward': 1.9909018011918749, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.99)
53% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (4, 6), heading: (0, -1), action: None, reward: 2.48183597706
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 16, 't': 14, 'action': None, 'reward': 2.4818359770603875, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.48)
50% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (4, 6), heading: (0, -1), action: None, reward: 2.68720055497
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 15, 't': 15, 'action': None, 'reward': 2.6872005549730793, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.69)
47% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 5), heading: (0, -1), action: forward, reward: 2.07488637297
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 14, 't': 16, 'action': 'forward', 'reward': 2.074886372974939, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.07)
43% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 207
\-------------------------
Environment.reset(): Trial set up with start = (1, 2), destination = (6, 5), deadline = 30
Simulating trial. . .
epsilon = 0.1275; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: forward, reward: 1.40331304169
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 30, 't': 0, 'action': 'forward', 'reward': 1.4033130416898767, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent followed the waypoint forward. (rewarded 1.40)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: None, reward: 2.9524718363
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 29, 't': 1, 'action': None, 'reward': 2.952471836297872, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.95)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: None, reward: 1.04359935064
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 28, 't': 2, 'action': None, 'reward': 1.0435993506372023, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.04)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: None, reward: 2.17528938954
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 27, 't': 3, 'action': None, 'reward': 2.175289389543344, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 2.18)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 2), heading: (-1, 0), action: forward, reward: 2.63223165991
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 26, 't': 4, 'action': 'forward', 'reward': 2.632231659912865, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.63)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (6, 2), heading: (-1, 0), action: forward, reward: 2.73182128773
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 25, 't': 5, 'action': 'forward', 'reward': 2.7318212877270467, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.73)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (6, 7), heading: (0, -1), action: right, reward: 1.09842503705
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 24, 't': 6, 'action': 'right', 'reward': 1.0984250370464954, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.10)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 6), heading: (0, -1), action: forward, reward: 2.11198621702
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 23, 't': 7, 'action': 'forward', 'reward': 2.1119862170213732, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 2.11)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (7, 6), heading: (1, 0), action: right, reward: 0.40184252755
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'left'), 'deadline': 22, 't': 8, 'action': 'right', 'reward': 0.40184252755025385, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'left')
Agent drove right instead of forward. (rewarded 0.40)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (7, 5), heading: (0, -1), action: left, reward: 2.18154601301
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 21, 't': 9, 'action': 'left', 'reward': 2.1815460130113706, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.18)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (7, 5), heading: (0, -1), action: None, reward: 2.82811811299
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 20, 't': 10, 'action': None, 'reward': 2.8281181129938324, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.83)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 5), heading: (-1, 0), action: left, reward: 2.18576306723
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'right'), 'deadline': 19, 't': 11, 'action': 'left', 'reward': 2.185763067233446, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'right')
Agent followed the waypoint left. (rewarded 2.19)
60% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 208
\-------------------------
Environment.reset(): Trial set up with start = (2, 5), destination = (7, 4), deadline = 20
Simulating trial. . .
epsilon = 0.1262; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: forward, reward: 1.96513796478
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', 'left'), 'deadline': 20, 't': 0, 'action': 'forward', 'reward': 1.9651379647782374, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', 'left')
Agent drove forward instead of left. (rewarded 1.97)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: left, reward: -10.1336989056
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 19, 't': 1, 'action': 'left', 'reward': -10.133698905646078, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent attempted driving left through a red light. (rewarded -10.13)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: None, reward: 2.4047784674
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.40477846739865, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.40)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: left, reward: -39.4641916297
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 17, 't': 3, 'action': 'left', 'reward': -39.46419162973478, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -39.46)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 4), heading: (0, -1), action: left, reward: 2.48438248598
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 16, 't': 4, 'action': 'left', 'reward': 2.4843824859806247, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.48)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (3, 4), heading: (0, -1), action: forward, reward: -10.6177983427
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': -10.617798342680059, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -10.62)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 4), heading: (0, -1), action: None, reward: 1.41668275725
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 14, 't': 6, 'action': None, 'reward': 1.4166827572480607, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.42)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 4), heading: (0, -1), action: None, reward: 1.34639688832
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 13, 't': 7, 'action': None, 'reward': 1.3463968883236597, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.35)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 4), heading: (0, -1), action: None, reward: 1.80644106732
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 12, 't': 8, 'action': None, 'reward': 1.8064410673237379, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.81)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 4), heading: (-1, 0), action: left, reward: 1.80260086539
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 11, 't': 9, 'action': 'left', 'reward': 1.8026008653913743, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.80)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (2, 4), heading: (-1, 0), action: None, reward: 0.866801148581
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 10, 't': 10, 'action': None, 'reward': 0.866801148580981, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 0.87)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (2, 4), heading: (-1, 0), action: None, reward: 0.836637185031
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'right'), 'deadline': 9, 't': 11, 'action': None, 'reward': 0.8366371850312346, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 0.84)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (2, 4), heading: (-1, 0), action: None, reward: 2.10968725454
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 8, 't': 12, 'action': None, 'reward': 2.1096872545360896, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.11)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: forward, reward: 1.79371303741
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 7, 't': 13, 'action': 'forward', 'reward': 1.7937130374053576, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.79)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: None, reward: 1.15046445954
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'left'), 'deadline': 6, 't': 14, 'action': None, 'reward': 1.1504644595423672, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'left')
Agent properly idled at a red light. (rewarded 1.15)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (8, 4), heading: (-1, 0), action: forward, reward: 2.02186025291
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 5, 't': 15, 'action': 'forward', 'reward': 2.0218602529082927, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.02)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (8, 4), heading: (-1, 0), action: None, reward: 2.01923767617
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 4, 't': 16, 'action': None, 'reward': 2.01923767616701, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.02)
15% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (8, 4), heading: (-1, 0), action: None, reward: 1.63870338862
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 3, 't': 17, 'action': None, 'reward': 1.638703388622141, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 1.64)
10% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (8, 4), heading: (-1, 0), action: None, reward: 1.12318388319
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 2, 't': 18, 'action': None, 'reward': 1.123183883185026, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.12)
5% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (7, 4), heading: (-1, 0), action: forward, reward: 0.59434591955
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 1, 't': 19, 'action': 'forward', 'reward': 0.594345919549917, 'waypoint': 'forward'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('forward', 'green', None, 'right')
Agent followed the waypoint forward. (rewarded 0.59)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 209
\-------------------------
Environment.reset(): Trial set up with start = (6, 6), destination = (1, 5), deadline = 20
Simulating trial. . .
epsilon = 0.1249; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 6), heading: (0, 1), action: None, reward: 1.03344842268
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 20, 't': 0, 'action': None, 'reward': 1.0334484226839995, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.03)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 6), heading: (0, 1), action: None, reward: 1.25450624597
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 19, 't': 1, 'action': None, 'reward': 1.254506245974091, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.25)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 6), heading: (0, 1), action: left, reward: -9.37224984158
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 18, 't': 2, 'action': 'left', 'reward': -9.37224984158239, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent attempted driving left through a red light. (rewarded -9.37)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (6, 6), heading: (0, 1), action: None, reward: 1.18461427011
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 1.1846142701060016, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.18)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 6), heading: (1, 0), action: left, reward: 1.93579063562
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 16, 't': 4, 'action': 'left', 'reward': 1.9357906356193855, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.94)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: forward, reward: 2.10210561834
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 2.102105618336952, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 2.10)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: forward, reward: 2.11624757634
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 14, 't': 6, 'action': 'forward', 'reward': 2.1162475763366104, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.12)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: right, reward: 0.395042178277
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 13, 't': 7, 'action': 'right', 'reward': 0.39504217827718247, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 0.40)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: right, reward: 1.36811775869
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 12, 't': 8, 'action': 'right', 'reward': 1.3681177586924689, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 1.37)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: None, reward: 0.4071741506
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', 'forward'), 'deadline': 11, 't': 9, 'action': None, 'reward': 0.40717415059975925, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 0.41)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: forward, reward: 0.393844745012
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', 'right'), 'deadline': 10, 't': 10, 'action': 'forward', 'reward': 0.39384474501159283, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', 'right')
Agent drove forward instead of right. (rewarded 0.39)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (7, 6), heading: (0, -1), action: right, reward: 1.74969682831
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 9, 't': 11, 'action': 'right', 'reward': 1.7496968283082224, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.75)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: right, reward: 2.31807603162
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 8, 't': 12, 'action': 'right', 'reward': 2.318076031622005, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.32)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: None, reward: 2.03535842917
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 7, 't': 13, 'action': None, 'reward': 2.0353584291744324, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.04)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: None, reward: 2.18808318193
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 6, 't': 14, 'action': None, 'reward': 2.1880831819313, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.19)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (8, 7), heading: (0, 1), action: right, reward: -0.0898642848974
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', 'left'), 'deadline': 5, 't': 15, 'action': 'right', 'reward': -0.08986428489738518, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', 'left')
Agent drove right instead of forward. (rewarded -0.09)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (8, 7), heading: (0, 1), action: None, reward: 1.20285503837
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 4, 't': 16, 'action': None, 'reward': 1.2028550383687786, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.20)
15% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (8, 7), heading: (0, 1), action: None, reward: 1.6215172001
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 3, 't': 17, 'action': None, 'reward': 1.6215172001020157, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.62)
10% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (8, 7), heading: (0, 1), action: None, reward: 0.710454040385
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 2, 't': 18, 'action': None, 'reward': 0.7104540403850388, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 0.71)
5% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: right, reward: 0.672394075312
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 1, 't': 19, 'action': 'right', 'reward': 0.67239407531205, 'waypoint': 'left'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 0.67)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 210
\-------------------------
Environment.reset(): Trial set up with start = (5, 3), destination = (3, 7), deadline = 20
Simulating trial. . .
epsilon = 0.1237; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (4, 3), heading: (-1, 0), action: right, reward: 1.29812552548
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.2981255254821513, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.30)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (4, 4), heading: (0, 1), action: left, reward: 0.455154424286
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 19, 't': 1, 'action': 'left', 'reward': 0.45515442428563835, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent drove left instead of forward. (rewarded 0.46)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 4), heading: (-1, 0), action: right, reward: 2.47271264872
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': 'right', 'reward': 2.472712648720357, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 2.47)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 3), heading: (0, -1), action: right, reward: 2.89193278507
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 17, 't': 3, 'action': 'right', 'reward': 2.891932785074898, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 2.89)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 2), heading: (0, -1), action: forward, reward: 1.6655123168
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 1.6655123167973616, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.67)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 7), heading: (0, -1), action: forward, reward: 2.84567477932
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 2.8456747793237493, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 2.85)
70% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 211
\-------------------------
Environment.reset(): Trial set up with start = (2, 5), destination = (6, 6), deadline = 25
Simulating trial. . .
epsilon = 0.1225; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: None, reward: 2.52772920576
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 25, 't': 0, 'action': None, 'reward': 2.5277292057628795, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 2.53)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: None, reward: 1.46911202787
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 24, 't': 1, 'action': None, 'reward': 1.4691120278684402, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.47)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: None, reward: 0.992152270734
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'forward'), 'deadline': 23, 't': 2, 'action': None, 'reward': 0.9921522707340131, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 0.99)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: None, reward: 2.22691637141
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 22, 't': 3, 'action': None, 'reward': 2.226916371407399, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.23)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: forward, reward: 2.86947224771
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 2.869472247714589, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.87)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: None, reward: 2.50162490191
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 20, 't': 5, 'action': None, 'reward': 2.5016249019076726, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.50)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: None, reward: 2.16789377527
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'right'), 'deadline': 19, 't': 6, 'action': None, 'reward': 2.167893775266697, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'right')
Agent properly idled at a red light. (rewarded 2.17)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: forward, reward: 2.10714171719
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 2.107141717189275, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.11)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: None, reward: 1.55810138626
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 17, 't': 8, 'action': None, 'reward': 1.5581013862593915, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.56)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (7, 5), heading: (-1, 0), action: forward, reward: 1.383842171
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'left'), 'deadline': 16, 't': 9, 'action': 'forward', 'reward': 1.3838421709992643, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'left')
Agent followed the waypoint forward. (rewarded 1.38)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (7, 5), heading: (-1, 0), action: forward, reward: -10.5486347005
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 15, 't': 10, 'action': 'forward', 'reward': -10.54863470052054, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent attempted driving forward through a red light. (rewarded -10.55)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (7, 4), heading: (0, -1), action: right, reward: 1.76853182614
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 14, 't': 11, 'action': 'right', 'reward': 1.768531826140395, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent drove right instead of forward. (rewarded 1.77)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (6, 4), heading: (-1, 0), action: left, reward: 1.12326136764
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'right'), 'deadline': 13, 't': 12, 'action': 'left', 'reward': 1.1232613676433663, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'right')
Agent followed the waypoint left. (rewarded 1.12)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (6, 4), heading: (-1, 0), action: None, reward: 1.03056694064
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 12, 't': 13, 'action': None, 'reward': 1.0305669406388587, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.03)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (6, 4), heading: (-1, 0), action: None, reward: 2.5766873215
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 11, 't': 14, 'action': None, 'reward': 2.576687321503102, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.58)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (6, 5), heading: (0, 1), action: left, reward: 1.96494494355
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 10, 't': 15, 'action': 'left', 'reward': 1.9649449435466408, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.96)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 6), heading: (0, 1), action: forward, reward: 1.78254907894
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 9, 't': 16, 'action': 'forward', 'reward': 1.7825490789418865, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.78)
32% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 212
\-------------------------
Environment.reset(): Trial set up with start = (8, 7), destination = (2, 3), deadline = 20
Simulating trial. . .
epsilon = 0.1212; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: None, reward: 2.08086890924
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 20, 't': 0, 'action': None, 'reward': 2.080868909240534, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.08)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: None, reward: 2.58386940634
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 19, 't': 1, 'action': None, 'reward': 2.5838694063362926, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.58)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: None, reward: 1.98642760544
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.9864276054359766, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 1.99)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: None, reward: 2.19461923871
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 2.194619238708726, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.19)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: forward, reward: 1.05037383423
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 1.050373834228544, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.05)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 7), heading: (1, 0), action: forward, reward: 2.62003867586
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 2.6200386758620335, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.62)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 2), heading: (0, 1), action: right, reward: 2.60066985792
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 14, 't': 6, 'action': 'right', 'reward': 2.6006698579164103, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 2.60)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 2), heading: (0, 1), action: None, reward: 0.996879209586
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 13, 't': 7, 'action': None, 'reward': 0.9968792095860328, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.00)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 2), heading: (0, 1), action: None, reward: 2.51272447294
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 12, 't': 8, 'action': None, 'reward': 2.5127244729445004, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.51)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (2, 3), heading: (0, 1), action: forward, reward: 1.03866842918
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 11, 't': 9, 'action': 'forward', 'reward': 1.038668429183259, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.04)
50% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 213
\-------------------------
Environment.reset(): Trial set up with start = (6, 2), destination = (3, 3), deadline = 20
Simulating trial. . .
epsilon = 0.1200; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 2), heading: (-1, 0), action: None, reward: 1.47393006065
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 20, 't': 0, 'action': None, 'reward': 1.4739300606507189, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.47)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 2), heading: (-1, 0), action: None, reward: 2.83437845261
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 19, 't': 1, 'action': None, 'reward': 2.8343784526137012, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.83)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 2), heading: (-1, 0), action: None, reward: 1.4856329945
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.4856329944987963, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.49)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 2), heading: (-1, 0), action: forward, reward: 2.87308206629
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 2.873082066290128, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 2.87)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (4, 2), heading: (-1, 0), action: forward, reward: 2.47208036115
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 2.4720803611509963, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.47)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (3, 2), heading: (-1, 0), action: forward, reward: 2.68613167698
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 2.6861316769757853, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.69)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 3), heading: (0, 1), action: left, reward: 1.51559278744
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 14, 't': 6, 'action': 'left', 'reward': 1.5155927874369866, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.52)
65% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 214
\-------------------------
Environment.reset(): Trial set up with start = (7, 7), destination = (2, 5), deadline = 25
Simulating trial. . .
epsilon = 0.1188; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: left, reward: 2.27878299678
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 25, 't': 0, 'action': 'left', 'reward': 2.2787829967821613, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.28)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: forward, reward: 1.65405819116
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 24, 't': 1, 'action': 'forward', 'reward': 1.654058191161648, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.65)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: None, reward: 2.34856333304
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 23, 't': 2, 'action': None, 'reward': 2.348563333036715, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.35)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 7), heading: (1, 0), action: forward, reward: 1.48281740589
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': 1.4828174058878913, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.48)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 6), heading: (0, -1), action: left, reward: 1.67070507482
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'right'), 'deadline': 21, 't': 4, 'action': 'left', 'reward': 1.670705074820733, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'right')
Agent followed the waypoint left. (rewarded 1.67)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 6), heading: (0, -1), action: None, reward: 2.46905820049
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'left'), 'deadline': 20, 't': 5, 'action': None, 'reward': 2.4690582004949606, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'left')
Agent properly idled at a red light. (rewarded 2.47)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 6), heading: (0, -1), action: None, reward: 1.33768410407
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 19, 't': 6, 'action': None, 'reward': 1.3376841040725878, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.34)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 6), heading: (0, -1), action: None, reward: 2.46914573897
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 18, 't': 7, 'action': None, 'reward': 2.469145738965753, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 2.47)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 6), heading: (0, -1), action: None, reward: 1.60920113316
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 17, 't': 8, 'action': None, 'reward': 1.6092011331588758, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.61)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: left, reward: 1.46314111757
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'right'), 'deadline': 16, 't': 9, 'action': 'left', 'reward': 1.4631411175651445, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'right')
Agent drove left instead of forward. (rewarded 1.46)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: left, reward: 0.451165343518
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', 'forward'), 'deadline': 15, 't': 10, 'action': 'left', 'reward': 0.4511653435177573, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', 'forward')
Agent drove left instead of right. (rewarded 0.45)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: None, reward: 1.96539982364
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', 'forward'), 'deadline': 14, 't': 11, 'action': None, 'reward': 1.9653998236427437, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 1.97)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: None, reward: 0.810334180746
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 13, 't': 12, 'action': None, 'reward': 0.8103341807458797, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 0.81)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: None, reward: -5.0603001066
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 12, 't': 13, 'action': None, 'reward': -5.060300106596193, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.06)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: right, reward: 1.70620323411
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 11, 't': 14, 'action': 'right', 'reward': 1.7062032341096485, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 1.71)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (8, 6), heading: (0, -1), action: right, reward: 0.815269087139
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 10, 't': 15, 'action': 'right', 'reward': 0.8152690871391997, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 0.82)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: right, reward: 0.906770175482
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 9, 't': 16, 'action': 'right', 'reward': 0.9067701754817921, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent followed the waypoint right. (rewarded 0.91)
32% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (1, 5), heading: (0, -1), action: left, reward: 1.5212488786
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'right'), 'deadline': 8, 't': 17, 'action': 'left', 'reward': 1.5212488785988951, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'right')
Agent drove left instead of forward. (rewarded 1.52)
28% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (1, 5), heading: (0, -1), action: None, reward: 0.760294285619
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'right', None), 'deadline': 7, 't': 18, 'action': None, 'reward': 0.7602942856187351, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 0.76)
24% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: right, reward: 1.73473903794
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 6, 't': 19, 'action': 'right', 'reward': 1.7347390379437984, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent followed the waypoint right. (rewarded 1.73)
20% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 215
\-------------------------
Environment.reset(): Trial set up with start = (4, 6), destination = (8, 6), deadline = 20
Simulating trial. . .
epsilon = 0.1177; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 6), heading: (-1, 0), action: right, reward: 1.2417635978
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', 'left'), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.2417635977961456, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', 'left')
Agent followed the waypoint right. (rewarded 1.24)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: forward, reward: 2.60710441387
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 19, 't': 1, 'action': 'forward', 'reward': 2.6071044138707773, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.61)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: None, reward: 2.15806594888
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.158065948883923, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.16)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: None, reward: 0.96309494829
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 0.963094948289724, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 0.96)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: forward, reward: -10.0056782077
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': -10.00567820772828, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent attempted driving forward through a red light. (rewarded -10.01)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: forward, reward: 1.85044393185
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'left'), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 1.8504439318527488, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'left')
Agent followed the waypoint forward. (rewarded 1.85)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (8, 6), heading: (-1, 0), action: forward, reward: 2.80670203076
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'left'), 'deadline': 14, 't': 6, 'action': 'forward', 'reward': 2.806702030763355, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'left')
Agent followed the waypoint forward. (rewarded 2.81)
65% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 216
\-------------------------
Environment.reset(): Trial set up with start = (3, 7), destination = (6, 3), deadline = 25
Simulating trial. . .
epsilon = 0.1165; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 7), heading: (-1, 0), action: left, reward: -39.7362150416
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 25, 't': 0, 'action': 'left', 'reward': -39.736215041591485, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -39.74)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 7), heading: (-1, 0), action: None, reward: 2.29415878494
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 24, 't': 1, 'action': None, 'reward': 2.2941587849364256, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.29)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 7), heading: (-1, 0), action: None, reward: 2.47848094094
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 23, 't': 2, 'action': None, 'reward': 2.4784809409368505, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.48)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 2), heading: (0, 1), action: left, reward: 1.43333830635
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 22, 't': 3, 'action': 'left', 'reward': 1.433338306348007, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 1.43)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (4, 2), heading: (1, 0), action: left, reward: 2.60583395251
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 21, 't': 4, 'action': 'left', 'reward': 2.6058339525094905, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 2.61)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (4, 2), heading: (1, 0), action: None, reward: 1.66964295924
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 20, 't': 5, 'action': None, 'reward': 1.6696429592376, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 1.67)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 2), heading: (1, 0), action: None, reward: 2.47000837548
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 19, 't': 6, 'action': None, 'reward': 2.470008375479879, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.47)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (5, 2), heading: (1, 0), action: forward, reward: 1.16440109029
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 1.1644010902894404, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.16)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (5, 2), heading: (1, 0), action: None, reward: 0.909307997775
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 17, 't': 8, 'action': None, 'reward': 0.9093079977747323, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 0.91)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (5, 2), heading: (1, 0), action: right, reward: -19.6467085353
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 3, 'light': 'red', 'state': ('forward', 'red', 'forward', 'forward'), 'deadline': 16, 't': 9, 'action': 'right', 'reward': -19.64670853531592, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'forward')
Agent attempted driving right through traffic and cause a minor accident. (rewarded -19.65)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (6, 2), heading: (1, 0), action: forward, reward: 2.60208223005
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 15, 't': 10, 'action': 'forward', 'reward': 2.6020822300480173, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.60)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (6, 2), heading: (1, 0), action: right, reward: -19.7670235706
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 3, 'light': 'red', 'state': ('right', 'red', None, 'forward'), 'deadline': 14, 't': 11, 'action': 'right', 'reward': -19.767023570583973, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'forward')
Agent attempted driving right through traffic and cause a minor accident. (rewarded -19.77)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 3), heading: (0, 1), action: right, reward: 0.79150745663
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', 'left'), 'deadline': 13, 't': 12, 'action': 'right', 'reward': 0.7915074566299876, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', 'left')
Agent followed the waypoint right. (rewarded 0.79)
48% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 217
\-------------------------
Environment.reset(): Trial set up with start = (2, 4), destination = (5, 5), deadline = 20
Simulating trial. . .
epsilon = 0.1153; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: right, reward: 1.04627933117
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', 'right'), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.0462793311737817, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', 'right')
Agent drove right instead of left. (rewarded 1.05)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 3), heading: (0, -1), action: right, reward: 1.15938279149
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 1.1593827914883046, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent drove right instead of forward. (rewarded 1.16)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 3), heading: (-1, 0), action: left, reward: 2.23279290535
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 18, 't': 2, 'action': 'left', 'reward': 2.2327929053459794, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.23)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 3), heading: (-1, 0), action: forward, reward: 1.80101010073
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 1.8010101007337596, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.80)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 3), heading: (-1, 0), action: None, reward: 1.17106770574
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 16, 't': 4, 'action': None, 'reward': 1.1710677057397445, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.17)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (6, 3), heading: (-1, 0), action: forward, reward: 1.10996084042
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 1.1099608404238095, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.11)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (6, 3), heading: (-1, 0), action: None, reward: 1.50152384798
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 14, 't': 6, 'action': None, 'reward': 1.5015238479843243, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.50)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (5, 3), heading: (-1, 0), action: forward, reward: 1.01461762018
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 1.0146176201841142, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent followed the waypoint forward. (rewarded 1.01)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (5, 4), heading: (0, 1), action: left, reward: 2.04592009791
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 12, 't': 8, 'action': 'left', 'reward': 2.0459200979146903, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 2.05)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 5), heading: (0, 1), action: forward, reward: 2.61446176387
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 11, 't': 9, 'action': 'forward', 'reward': 2.6144617638734178, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.61)
50% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 218
\-------------------------
Environment.reset(): Trial set up with start = (3, 7), destination = (6, 2), deadline = 20
Simulating trial. . .
epsilon = 0.1142; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 7), heading: (-1, 0), action: None, reward: 2.40206070858
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', 'left'), 'deadline': 20, 't': 0, 'action': None, 'reward': 2.402060708578043, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'left')
Agent properly idled at a red light. (rewarded 2.40)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 7), heading: (-1, 0), action: None, reward: 1.53403289926
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', 'forward'), 'deadline': 19, 't': 1, 'action': None, 'reward': 1.5340328992567152, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 1.53)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 7), heading: (-1, 0), action: None, reward: 2.36444632162
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.364446321622029, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.36)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 6), heading: (0, -1), action: right, reward: 0.986544567858
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 17, 't': 3, 'action': 'right', 'reward': 0.9865445678575867, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 0.99)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (4, 6), heading: (1, 0), action: right, reward: 1.36251397762
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 16, 't': 4, 'action': 'right', 'reward': 1.362513977615682, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.36)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (4, 6), heading: (1, 0), action: None, reward: 2.60048999494
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 15, 't': 5, 'action': None, 'reward': 2.6004899949361855, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.60)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 6), heading: (1, 0), action: None, reward: 2.87610317574
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 14, 't': 6, 'action': None, 'reward': 2.876103175741517, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.88)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: forward, reward: 1.5051746411
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 1.5051746411029343, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent followed the waypoint forward. (rewarded 1.51)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: None, reward: 2.73882072702
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 12, 't': 8, 'action': None, 'reward': 2.7388207270230573, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.74)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: None, reward: 1.79249633306
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 11, 't': 9, 'action': None, 'reward': 1.792496333062719, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.79)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: None, reward: 1.03618910006
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 10, 't': 10, 'action': None, 'reward': 1.036189100057136, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.04)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (6, 6), heading: (1, 0), action: forward, reward: 1.23267345197
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 9, 't': 11, 'action': 'forward', 'reward': 1.2326734519696567, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent followed the waypoint forward. (rewarded 1.23)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (6, 7), heading: (0, 1), action: right, reward: 1.29854784173
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 8, 't': 12, 'action': 'right', 'reward': 1.2985478417275003, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 1.30)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (6, 7), heading: (0, 1), action: left, reward: -39.7353221911
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 7, 't': 13, 'action': 'left', 'reward': -39.735322191079725, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -39.74)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (6, 7), heading: (0, 1), action: None, reward: 1.13720919189
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 6, 't': 14, 'action': None, 'reward': 1.1372091918946607, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.14)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 2), heading: (0, 1), action: forward, reward: 0.976309842138
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 5, 't': 15, 'action': 'forward', 'reward': 0.9763098421378915, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 0.98)
20% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 219
\-------------------------
Environment.reset(): Trial set up with start = (3, 5), destination = (1, 3), deadline = 20
Simulating trial. . .
epsilon = 0.1130; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: forward, reward: 2.80406806353
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 20, 't': 0, 'action': 'forward', 'reward': 2.80406806353088, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 2.80)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: None, reward: 1.63927262301
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 19, 't': 1, 'action': None, 'reward': 1.6392726230083998, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.64)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: None, reward: 1.6172174737
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.617217473701473, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.62)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: None, reward: 1.49360457365
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 1.493604573653925, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.49)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: forward, reward: 2.79559679754
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 2.795596797543099, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.80)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: None, reward: 1.2022658683
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 15, 't': 5, 'action': None, 'reward': 1.2022658683041518, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.20)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 4), heading: (0, -1), action: right, reward: 1.73152676095
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 14, 't': 6, 'action': 'right', 'reward': 1.7315267609538205, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.73)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 4), heading: (0, -1), action: None, reward: 1.22104537833
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'forward'), 'deadline': 13, 't': 7, 'action': None, 'reward': 1.2210453783300657, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 1.22)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (1, 4), heading: (0, -1), action: None, reward: 0.846365845035
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 12, 't': 8, 'action': None, 'reward': 0.846365845034786, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 0.85)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 3), heading: (0, -1), action: forward, reward: 1.94328468944
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 11, 't': 9, 'action': 'forward', 'reward': 1.943284689442939, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.94)
50% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 220
\-------------------------
Environment.reset(): Trial set up with start = (2, 7), destination = (7, 4), deadline = 30
Simulating trial. . .
epsilon = 0.1119; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 7), heading: (0, 1), action: None, reward: 1.24399639139
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'forward'), 'deadline': 30, 't': 0, 'action': None, 'reward': 1.2439963913860779, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.24)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 7), heading: (-1, 0), action: right, reward: 2.13246800313
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 29, 't': 1, 'action': 'right', 'reward': 2.132468003125947, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.13)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: forward, reward: 1.2643972443
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 28, 't': 2, 'action': 'forward', 'reward': 1.2643972442984712, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'right')
Agent followed the waypoint forward. (rewarded 1.26)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: None, reward: 1.64498087319
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 27, 't': 3, 'action': None, 'reward': 1.6449808731916264, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.64)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: forward, reward: 2.73331876255
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 26, 't': 4, 'action': 'forward', 'reward': 2.7333187625470696, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.73)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: None, reward: 1.80188102205
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 25, 't': 5, 'action': None, 'reward': 1.8018810220512647, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.80)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: None, reward: 1.62250910351
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 24, 't': 6, 'action': None, 'reward': 1.6225091035107366, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.62)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (7, 6), heading: (0, -1), action: right, reward: 0.975930143411
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 23, 't': 7, 'action': 'right', 'reward': 0.9759301434107185, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent drove right instead of left. (rewarded 0.98)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (7, 5), heading: (0, -1), action: forward, reward: 1.44056979648
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 22, 't': 8, 'action': 'forward', 'reward': 1.4405697964804127, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.44)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (7, 4), heading: (0, -1), action: forward, reward: 1.63910117981
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 21, 't': 9, 'action': 'forward', 'reward': 1.6391011798065418, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent followed the waypoint forward. (rewarded 1.64)
67% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 221
\-------------------------
Environment.reset(): Trial set up with start = (2, 2), destination = (5, 4), deadline = 25
Simulating trial. . .
epsilon = 0.1108; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 2), heading: (1, 0), action: right, reward: 2.63150779912
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'right', None), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 2.6315077991222315, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'right', None)
Agent followed the waypoint right. (rewarded 2.63)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 2), heading: (1, 0), action: None, reward: 1.75713968813
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 24, 't': 1, 'action': None, 'reward': 1.75713968812754, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 1.76)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 2), heading: (1, 0), action: None, reward: 2.36088380547
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 23, 't': 2, 'action': None, 'reward': 2.3608838054713583, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 2.36)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 2), heading: (1, 0), action: None, reward: 1.57732756233
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 22, 't': 3, 'action': None, 'reward': 1.5773275623314837, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.58)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 2), heading: (1, 0), action: None, reward: 2.2531346552
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'forward'), 'deadline': 21, 't': 4, 'action': None, 'reward': 2.253134655196124, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 2.25)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (4, 2), heading: (1, 0), action: forward, reward: 1.34827610402
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 20, 't': 5, 'action': 'forward', 'reward': 1.3482761040186213, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent followed the waypoint forward. (rewarded 1.35)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (5, 2), heading: (1, 0), action: forward, reward: 1.09378829832
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 19, 't': 6, 'action': 'forward', 'reward': 1.0937882983239784, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.09)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: right, reward: 0.942058192004
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 18, 't': 7, 'action': 'right', 'reward': 0.9420581920042013, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 0.94)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: None, reward: 1.01868657953
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 17, 't': 8, 'action': None, 'reward': 1.01868657953017, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.02)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: None, reward: 2.40787510183
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 16, 't': 9, 'action': None, 'reward': 2.4078751018292373, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.41)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: None, reward: 2.39943628952
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'forward'), 'deadline': 15, 't': 10, 'action': None, 'reward': 2.3994362895194614, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 2.40)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 4), heading: (0, 1), action: forward, reward: 1.08984205963
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 14, 't': 11, 'action': 'forward', 'reward': 1.0898420596295597, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.09)
52% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 222
\-------------------------
Environment.reset(): Trial set up with start = (6, 7), destination = (3, 3), deadline = 25
Simulating trial. . .
epsilon = 0.1097; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 2), heading: (0, 1), action: right, reward: 2.72946343145
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 2.7294634314547723, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent followed the waypoint right. (rewarded 2.73)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 2), heading: (-1, 0), action: right, reward: 1.76753782363
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'right', None), 'deadline': 24, 't': 1, 'action': 'right', 'reward': 1.767537823630226, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', None)
Agent followed the waypoint right. (rewarded 1.77)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (4, 2), heading: (-1, 0), action: forward, reward: 1.05501587166
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 23, 't': 2, 'action': 'forward', 'reward': 1.0550158716604314, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.06)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 2), heading: (-1, 0), action: forward, reward: 2.25083583041
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': 2.250835830413554, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.25)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 2), heading: (-1, 0), action: None, reward: 2.88154560664
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', 'left'), 'deadline': 21, 't': 4, 'action': None, 'reward': 2.881545606643014, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', 'left')
Agent properly idled at a red light. (rewarded 2.88)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (3, 2), heading: (-1, 0), action: left, reward: -19.016584977
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 3, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 20, 't': 5, 'action': 'left', 'reward': -19.016584976971473, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent attempted driving left through traffic and cause a minor accident. (rewarded -19.02)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 3), heading: (0, 1), action: left, reward: 2.85491365067
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 19, 't': 6, 'action': 'left', 'reward': 2.8549136506704866, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 2.85)
72% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 223
\-------------------------
Environment.reset(): Trial set up with start = (6, 3), destination = (2, 3), deadline = 20
Simulating trial. . .
epsilon = 0.1086; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 3), heading: (0, 1), action: None, reward: 2.64207104525
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 20, 't': 0, 'action': None, 'reward': 2.642071045254529, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.64)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 3), heading: (0, 1), action: None, reward: 1.69966319043
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 19, 't': 1, 'action': None, 'reward': 1.6996631904283397, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.70)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 3), heading: (0, 1), action: None, reward: 1.46061701389
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.4606170138877244, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.46)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 3), heading: (1, 0), action: left, reward: 1.63125028049
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'right'), 'deadline': 17, 't': 3, 'action': 'left', 'reward': 1.6312502804921492, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'right')
Agent followed the waypoint left. (rewarded 1.63)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: forward, reward: 1.49723856597
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 1.4972385659653338, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.50)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: None, reward: 1.82241681904
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'forward'), 'deadline': 15, 't': 5, 'action': None, 'reward': 1.8224168190401233, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 1.82)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: None, reward: 2.55586399815
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 14, 't': 6, 'action': None, 'reward': 2.5558639981455924, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.56)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: None, reward: 2.46082495219
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 13, 't': 7, 'action': None, 'reward': 2.460824952186837, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.46)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: None, reward: 1.93935153389
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 12, 't': 8, 'action': None, 'reward': 1.9393515338855565, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 1.94)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (8, 2), heading: (0, -1), action: left, reward: 1.20380977795
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'right'), 'deadline': 11, 't': 9, 'action': 'left', 'reward': 1.2038097779459642, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'right')
Agent drove left instead of forward. (rewarded 1.20)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: right, reward: 2.46518476912
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 10, 't': 10, 'action': 'right', 'reward': 2.465184769123119, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.47)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (2, 2), heading: (1, 0), action: forward, reward: 2.04417585795
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 9, 't': 11, 'action': 'forward', 'reward': 2.0441758579466285, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 2.04)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (2, 3), heading: (0, 1), action: right, reward: 1.20414980468
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 8, 't': 12, 'action': 'right', 'reward': 1.2041498046847368, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 1.20)
35% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 224
\-------------------------
Environment.reset(): Trial set up with start = (1, 5), destination = (8, 2), deadline = 20
Simulating trial. . .
epsilon = 0.1075; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: right, reward: 1.71906217838
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'right', 'left'), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.7190621783814932, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'right', 'left')
Agent followed the waypoint right. (rewarded 1.72)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 6), heading: (0, 1), action: left, reward: 1.52286410153
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 19, 't': 1, 'action': 'left', 'reward': 1.5228641015291082, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.52)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 7), heading: (0, 1), action: forward, reward: 1.48755014711
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 18, 't': 2, 'action': 'forward', 'reward': 1.4875501471149006, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.49)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (8, 2), heading: (0, 1), action: forward, reward: 1.44194140032
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 1.4419414003217677, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.44)
80% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 225
\-------------------------
Environment.reset(): Trial set up with start = (3, 6), destination = (8, 5), deadline = 20
Simulating trial. . .
epsilon = 0.1065; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 6), heading: (-1, 0), action: None, reward: 1.08474482619
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 20, 't': 0, 'action': None, 'reward': 1.0847448261853836, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.08)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 6), heading: (-1, 0), action: None, reward: 1.87949484073
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'right'), 'deadline': 19, 't': 1, 'action': None, 'reward': 1.8794948407313976, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 1.88)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 6), heading: (-1, 0), action: None, reward: 1.58326852147
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'right'), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.583268521474529, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 1.58)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: forward, reward: 2.20144577377
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 2.2014457737666233, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.20)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 7), heading: (0, 1), action: left, reward: 1.704137127
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 16, 't': 4, 'action': 'left', 'reward': 1.7041371269955135, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent drove left instead of forward. (rewarded 1.70)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 7), heading: (-1, 0), action: right, reward: 2.8226549504
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 15, 't': 5, 'action': 'right', 'reward': 2.8226549504011107, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 2.82)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 7), heading: (-1, 0), action: None, reward: 1.89286512037
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'right'), 'deadline': 14, 't': 6, 'action': None, 'reward': 1.892865120370407, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 1.89)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: forward, reward: 2.24558985296
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 2.245589852959024, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.25)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (8, 2), heading: (0, 1), action: left, reward: 0.843712508945
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', 'forward'), 'deadline': 12, 't': 8, 'action': 'left', 'reward': 0.8437125089447511, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', 'forward')
Agent drove left instead of right. (rewarded 0.84)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (7, 2), heading: (-1, 0), action: right, reward: 1.01726886372
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 11, 't': 9, 'action': 'right', 'reward': 1.017268863720096, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.02)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (7, 7), heading: (0, -1), action: right, reward: 1.54857234535
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'right', 'forward'), 'deadline': 10, 't': 10, 'action': 'right', 'reward': 1.5485723453544198, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'right', 'forward')
Agent followed the waypoint right. (rewarded 1.55)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: right, reward: 1.70787880376
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'right', None), 'deadline': 9, 't': 11, 'action': 'right', 'reward': 1.7078788037596278, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', None)
Agent followed the waypoint right. (rewarded 1.71)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: forward, reward: -10.0547452039
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 8, 't': 12, 'action': 'forward', 'reward': -10.054745203893132, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent attempted driving forward through a red light. (rewarded -10.05)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (8, 6), heading: (0, -1), action: left, reward: 1.76367702378
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'right'), 'deadline': 7, 't': 13, 'action': 'left', 'reward': 1.763677023780743, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'right')
Agent followed the waypoint left. (rewarded 1.76)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (8, 6), heading: (0, -1), action: None, reward: 1.26298286316
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 6, 't': 14, 'action': None, 'reward': 1.2629828631646884, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.26)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (8, 6), heading: (0, -1), action: None, reward: 1.4728920677
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 5, 't': 15, 'action': None, 'reward': 1.472892067696908, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 1.47)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (8, 5), heading: (0, -1), action: forward, reward: 1.6021380407
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 4, 't': 16, 'action': 'forward', 'reward': 1.6021380406998067, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.60)
15% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 226
\-------------------------
Environment.reset(): Trial set up with start = (4, 2), destination = (8, 6), deadline = 30
Simulating trial. . .
epsilon = 0.1054; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (4, 2), heading: (-1, 0), action: None, reward: 1.15613886852
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 30, 't': 0, 'action': None, 'reward': 1.1561388685162055, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.16)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (4, 2), heading: (-1, 0), action: None, reward: 2.43861518255
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 29, 't': 1, 'action': None, 'reward': 2.4386151825549796, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 2.44)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (4, 2), heading: (-1, 0), action: None, reward: 1.12500951021
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 28, 't': 2, 'action': None, 'reward': 1.125009510207796, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.13)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 2), heading: (-1, 0), action: forward, reward: 1.43838212176
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 27, 't': 3, 'action': 'forward', 'reward': 1.4383821217618304, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.44)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 2), heading: (-1, 0), action: None, reward: 2.25460271489
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 26, 't': 4, 'action': None, 'reward': 2.254602714892208, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.25)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: forward, reward: 2.16526449075
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 25, 't': 5, 'action': 'forward', 'reward': 2.1652644907535636, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.17)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 2), heading: (-1, 0), action: forward, reward: 1.96716107389
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 24, 't': 6, 'action': 'forward', 'reward': 1.9671610738935763, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.97)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 2), heading: (-1, 0), action: None, reward: 1.91565600288
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', 'forward'), 'deadline': 23, 't': 7, 'action': None, 'reward': 1.9156560028759593, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', 'forward')
Agent properly idled at a red light. (rewarded 1.92)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (1, 2), heading: (-1, 0), action: None, reward: 2.14833376748
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'right'), 'deadline': 22, 't': 8, 'action': None, 'reward': 2.1483337674773466, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 2.15)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: forward, reward: 2.04518714016
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 21, 't': 9, 'action': 'forward', 'reward': 2.0451871401589816, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.05)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (8, 7), heading: (0, -1), action: right, reward: 1.19781592701
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', 'left'), 'deadline': 20, 't': 10, 'action': 'right', 'reward': 1.1978159270064168, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', 'left')
Agent followed the waypoint right. (rewarded 1.20)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (8, 7), heading: (0, -1), action: None, reward: 2.61383051395
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'right'), 'deadline': 19, 't': 11, 'action': None, 'reward': 2.6138305139529723, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'right')
Agent properly idled at a red light. (rewarded 2.61)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (8, 7), heading: (0, -1), action: None, reward: 2.00193315049
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 18, 't': 12, 'action': None, 'reward': 2.001933150489627, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 2.00)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: right, reward: 0.131744858854
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'left'), 'deadline': 17, 't': 13, 'action': 'right', 'reward': 0.13174485885357456, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'left')
Agent drove right instead of forward. (rewarded 0.13)
53% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: None, reward: 1.59730041231
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 16, 't': 14, 'action': None, 'reward': 1.5973004123057764, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.60)
50% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (1, 6), heading: (0, -1), action: left, reward: 1.9556746168
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 15, 't': 15, 'action': 'left', 'reward': 1.955674616804318, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.96)
47% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (1, 5), heading: (0, -1), action: forward, reward: 0.152791119548
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', None), 'deadline': 14, 't': 16, 'action': 'forward', 'reward': 0.1527911195481908, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', None)
Agent drove forward instead of left. (rewarded 0.15)
43% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (1, 5), heading: (0, -1), action: None, reward: 1.08025505587
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 13, 't': 17, 'action': None, 'reward': 1.0802550558692565, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.08)
40% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (1, 5), heading: (0, -1), action: None, reward: 1.00899963331
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 12, 't': 18, 'action': None, 'reward': 1.008999633314121, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.01)
37% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: right, reward: 1.62534306796
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 11, 't': 19, 'action': 'right', 'reward': 1.6253430679630299, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 1.63)
33% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: left, reward: -19.1583057643
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 3, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 10, 't': 20, 'action': 'left', 'reward': -19.15830576433359, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent attempted driving left through traffic and cause a minor accident. (rewarded -19.16)
30% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: right, reward: 0.979347572107
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 9, 't': 21, 'action': 'right', 'reward': 0.9793475721071772, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent followed the waypoint right. (rewarded 0.98)
27% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: right, reward: 1.9384421705
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 8, 't': 22, 'action': 'right', 'reward': 1.9384421705029986, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent followed the waypoint right. (rewarded 1.94)
23% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: None, reward: 1.43070465634
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 7, 't': 23, 'action': None, 'reward': 1.4307046563406727, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.43)
20% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (1, 5), heading: (0, -1), action: right, reward: 1.32946732211
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', 'right'), 'deadline': 6, 't': 24, 'action': 'right', 'reward': 1.3294673221061284, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', 'right')
Agent drove right instead of forward. (rewarded 1.33)
17% of time remaining to reach destination.
/-------------------
| Step 25 Results
\-------------------
Environment.step(): t = 25
Environment.act() [POST]: location: (1, 5), heading: (0, -1), action: None, reward: 2.2593790928
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 5, 't': 25, 'action': None, 'reward': 2.259379092804849, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.26)
13% of time remaining to reach destination.
/-------------------
| Step 26 Results
\-------------------
Environment.step(): t = 26
Environment.act() [POST]: location: (1, 5), heading: (0, -1), action: None, reward: 1.59355146941
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 4, 't': 26, 'action': None, 'reward': 1.5935514694079782, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.59)
10% of time remaining to reach destination.
/-------------------
| Step 27 Results
\-------------------
Environment.step(): t = 27
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: right, reward: 1.15063807394
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 3, 't': 27, 'action': 'right', 'reward': 1.1506380739398916, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 1.15)
7% of time remaining to reach destination.
/-------------------
| Step 28 Results
\-------------------
Environment.step(): t = 28
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: right, reward: 1.30375793619
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 2, 't': 28, 'action': 'right', 'reward': 1.3037579361920508, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.30)
3% of time remaining to reach destination.
/-------------------
| Step 29 Results
\-------------------
Environment.step(): t = 29
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: right, reward: 0.387302740424
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 1, 't': 29, 'action': 'right', 'reward': 0.3873027404238998, 'waypoint': 'right'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('right', 'green', 'forward', None)
Agent followed the waypoint right. (rewarded 0.39)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 227
\-------------------------
Environment.reset(): Trial set up with start = (2, 3), destination = (5, 7), deadline = 25
Simulating trial. . .
epsilon = 0.1044; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: forward, reward: 1.46983588545
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 25, 't': 0, 'action': 'forward', 'reward': 1.4698358854516553, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.47)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: None, reward: 2.60160401024
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 24, 't': 1, 'action': None, 'reward': 2.6016040102433062, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.60)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: None, reward: 1.7016420119
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.7016420118996134, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.70)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (4, 3), heading: (1, 0), action: forward, reward: 2.82653827627
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': 2.8265382762732285, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.83)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 3), heading: (1, 0), action: forward, reward: 0.999829618883
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 0.9998296188827098, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.00)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 2), heading: (0, -1), action: left, reward: 2.7760051305
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 20, 't': 5, 'action': 'left', 'reward': 2.776005130499172, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 2.78)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (5, 2), heading: (0, -1), action: None, reward: 2.52939311476
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'right'), 'deadline': 19, 't': 6, 'action': None, 'reward': 2.5293931147586144, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'right')
Agent properly idled at a red light. (rewarded 2.53)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 7), heading: (0, -1), action: forward, reward: 0.917532254137
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 0.9175322541373094, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 0.92)
68% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 228
\-------------------------
Environment.reset(): Trial set up with start = (6, 7), destination = (7, 4), deadline = 20
Simulating trial. . .
epsilon = 0.1033; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 7), heading: (-1, 0), action: right, reward: 0.292296988381
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 0.29229698838065676, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 0.29)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 6), heading: (0, -1), action: right, reward: 1.23198314477
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 1.231983144768181, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 1.23)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 6), heading: (1, 0), action: right, reward: 1.32877335969
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 18, 't': 2, 'action': 'right', 'reward': 1.3287733596856552, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent followed the waypoint right. (rewarded 1.33)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (6, 6), heading: (1, 0), action: None, reward: 2.51144758267
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 17, 't': 3, 'action': None, 'reward': 2.511447582665485, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.51)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 6), heading: (1, 0), action: forward, reward: 2.82792650815
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 2.827926508148906, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.83)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 6), heading: (1, 0), action: None, reward: 2.36812136282
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 15, 't': 5, 'action': None, 'reward': 2.36812136282027, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.37)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (7, 5), heading: (0, -1), action: left, reward: 1.20274529851
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 14, 't': 6, 'action': 'left', 'reward': 1.2027452985080709, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.20)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (7, 5), heading: (0, -1), action: None, reward: 1.07400072024
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 13, 't': 7, 'action': None, 'reward': 1.0740007202418587, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.07)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (7, 5), heading: (0, -1), action: None, reward: 1.10141841595
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 12, 't': 8, 'action': None, 'reward': 1.1014184159478244, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.10)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (7, 5), heading: (0, -1), action: None, reward: 0.894437955078
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'right'), 'deadline': 11, 't': 9, 'action': None, 'reward': 0.8944379550779282, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 0.89)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (7, 5), heading: (0, -1), action: None, reward: 0.881364587082
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 10, 't': 10, 'action': None, 'reward': 0.8813645870820621, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 0.88)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (7, 4), heading: (0, -1), action: forward, reward: 2.68376924488
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 9, 't': 11, 'action': 'forward', 'reward': 2.683769244877694, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 2.68)
40% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 229
\-------------------------
Environment.reset(): Trial set up with start = (2, 3), destination = (1, 6), deadline = 20
Simulating trial. . .
epsilon = 0.1023; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: right, reward: 1.45591581346
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', 'left'), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.4559158134617662, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', 'left')
Agent drove right instead of left. (rewarded 1.46)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: None, reward: 1.4387003234
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', 'right'), 'deadline': 19, 't': 1, 'action': None, 'reward': 1.43870032339689, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'right')
Agent properly idled at a red light. (rewarded 1.44)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: None, reward: 2.72229766766
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.7222976676630344, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.72)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 2), heading: (0, -1), action: left, reward: 2.31453085051
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 17, 't': 3, 'action': 'left', 'reward': 2.314530850513978, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.31)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: left, reward: 2.25879192291
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 16, 't': 4, 'action': 'left', 'reward': 2.258791922907062, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.26)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 2), heading: (-1, 0), action: forward, reward: 1.17056629283
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 1.17056629283226, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.17)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 7), heading: (0, -1), action: right, reward: 1.09292052985
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', 'left'), 'deadline': 14, 't': 6, 'action': 'right', 'reward': 1.0929205298539475, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', 'left')
Agent followed the waypoint right. (rewarded 1.09)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 7), heading: (0, -1), action: None, reward: 1.35730242765
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 13, 't': 7, 'action': None, 'reward': 1.3573024276542716, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.36)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (1, 7), heading: (0, -1), action: None, reward: 1.08569442493
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 12, 't': 8, 'action': None, 'reward': 1.0856944249281566, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.09)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: left, reward: 1.69731627375
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 11, 't': 9, 'action': 'left', 'reward': 1.6973162737482252, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove left instead of forward. (rewarded 1.70)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (8, 6), heading: (0, -1), action: right, reward: 2.26306312438
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'right'), 'deadline': 10, 't': 10, 'action': 'right', 'reward': 2.2630631243794497, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'right')
Agent followed the waypoint right. (rewarded 2.26)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: right, reward: 1.92140626874
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 9, 't': 11, 'action': 'right', 'reward': 1.921406268740792, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.92)
40% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 230
\-------------------------
Environment.reset(): Trial set up with start = (3, 3), destination = (6, 4), deadline = 20
Simulating trial. . .
epsilon = 0.1013; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 4), heading: (0, 1), action: left, reward: 1.96167351432
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 20, 't': 0, 'action': 'left', 'reward': 1.9616735143213906, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.96)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 4), heading: (0, 1), action: None, reward: 1.89308431822
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 19, 't': 1, 'action': None, 'reward': 1.8930843182168973, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.89)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 4), heading: (0, 1), action: None, reward: 1.61481147245
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.6148114724508762, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.61)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 4), heading: (0, 1), action: None, reward: 2.88468925942
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 2.8846892594182902, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.88)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 4), heading: (0, 1), action: None, reward: 2.4461268728
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 16, 't': 4, 'action': None, 'reward': 2.446126872796726, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.45)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (4, 4), heading: (1, 0), action: left, reward: 2.39901569048
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 15, 't': 5, 'action': 'left', 'reward': 2.39901569048222, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.40)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 4), heading: (1, 0), action: None, reward: 1.96970075269
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 14, 't': 6, 'action': None, 'reward': 1.9697007526917378, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.97)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (4, 5), heading: (0, 1), action: right, reward: 0.983726897179
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'left'), 'deadline': 13, 't': 7, 'action': 'right', 'reward': 0.983726897179038, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'left')
Agent drove right instead of forward. (rewarded 0.98)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (5, 5), heading: (1, 0), action: left, reward: 2.08694953855
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 12, 't': 8, 'action': 'left', 'reward': 2.0869495385492285, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.09)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (5, 5), heading: (1, 0), action: None, reward: 1.43382336662
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 11, 't': 9, 'action': None, 'reward': 1.4338233666225668, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.43)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (5, 6), heading: (0, 1), action: right, reward: -0.0790082103044
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 10, 't': 10, 'action': 'right', 'reward': -0.07900821030442107, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent drove right instead of forward. (rewarded -0.08)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (6, 6), heading: (1, 0), action: left, reward: 1.78628184318
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'forward'), 'deadline': 9, 't': 11, 'action': 'left', 'reward': 1.786281843181379, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'forward')
Agent followed the waypoint left. (rewarded 1.79)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (6, 5), heading: (0, -1), action: left, reward: 2.53074142897
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'right'), 'deadline': 8, 't': 12, 'action': 'left', 'reward': 2.5307414289689025, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'right')
Agent followed the waypoint left. (rewarded 2.53)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (6, 5), heading: (0, -1), action: None, reward: 1.89462082681
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 7, 't': 13, 'action': None, 'reward': 1.8946208268107756, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.89)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (6, 5), heading: (0, -1), action: None, reward: 1.16753003524
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 6, 't': 14, 'action': None, 'reward': 1.1675300352417204, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.17)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 4), heading: (0, -1), action: forward, reward: 1.09168275901
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 5, 't': 15, 'action': 'forward', 'reward': 1.0916827590111928, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.09)
20% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 231
\-------------------------
Environment.reset(): Trial set up with start = (1, 3), destination = (2, 6), deadline = 20
Simulating trial. . .
epsilon = 0.1003; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: right, reward: 1.14587423509
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.1458742350900988, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 1.15)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 4), heading: (0, 1), action: right, reward: 0.433249616824
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'left'), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 0.4332496168243015, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'left')
Agent drove right instead of left. (rewarded 0.43)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 4), heading: (0, 1), action: None, reward: 1.17763584435
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.1776358443452328, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.18)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 4), heading: (0, 1), action: None, reward: 1.98435257588
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 1.9843525758840053, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.98)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 5), heading: (0, 1), action: forward, reward: 1.65090288285
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 1.6509028828534273, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.65)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: forward, reward: 1.46576250219
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 1.4657625021873981, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.47)
70% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 232
\-------------------------
Environment.reset(): Trial set up with start = (1, 2), destination = (7, 6), deadline = 20
Simulating trial. . .
epsilon = 0.0993; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 2), heading: (0, -1), action: None, reward: 2.22009458624
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 20, 't': 0, 'action': None, 'reward': 2.2200945862423294, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.22)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 2), heading: (0, -1), action: None, reward: 2.41362270685
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 19, 't': 1, 'action': None, 'reward': 2.413622706847695, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.41)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 2), heading: (0, -1), action: None, reward: 2.43091137764
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.430911377636691, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.43)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: left, reward: 2.16341388424
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 17, 't': 3, 'action': 'left', 'reward': 2.163413884244193, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.16)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 2), heading: (-1, 0), action: forward, reward: 2.09735819648
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 2.0973581964776695, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.10)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 7), heading: (0, -1), action: right, reward: 2.16449157183
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 15, 't': 5, 'action': 'right', 'reward': 2.1644915718332807, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.16)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (7, 7), heading: (0, -1), action: None, reward: 2.38571419619
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 14, 't': 6, 'action': None, 'reward': 2.3857141961923425, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.39)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (7, 6), heading: (0, -1), action: forward, reward: 1.7972784293
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 1.7972784293039166, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.80)
60% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 233
\-------------------------
Environment.reset(): Trial set up with start = (6, 2), destination = (1, 4), deadline = 25
Simulating trial. . .
epsilon = 0.0983; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (7, 2), heading: (1, 0), action: right, reward: 1.90741882613
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 1.907418826132028, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent followed the waypoint right. (rewarded 1.91)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (7, 2), heading: (1, 0), action: None, reward: 1.43599522769
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 24, 't': 1, 'action': None, 'reward': 1.4359952276947712, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.44)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 2), heading: (1, 0), action: None, reward: 1.10632515832
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.1063251583242277, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.11)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 2), heading: (1, 0), action: forward, reward: 2.64926262012
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': 2.6492626201244693, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.65)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: forward, reward: 1.15686752343
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 1.1568675234286214, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.16)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 3), heading: (0, 1), action: right, reward: 1.79848757798
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 20, 't': 5, 'action': 'right', 'reward': 1.7984875779847274, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 1.80)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 3), heading: (0, 1), action: None, reward: 1.94928909152
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 19, 't': 6, 'action': None, 'reward': 1.9492890915238028, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.95)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (8, 3), heading: (-1, 0), action: right, reward: 0.156621086701
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 18, 't': 7, 'action': 'right', 'reward': 0.15662108670132513, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent drove right instead of forward. (rewarded 0.16)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (8, 3), heading: (-1, 0), action: None, reward: 1.45848846346
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 17, 't': 8, 'action': None, 'reward': 1.4584884634573863, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.46)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (8, 4), heading: (0, 1), action: left, reward: 1.0487145486
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 16, 't': 9, 'action': 'left', 'reward': 1.048714548598996, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.05)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (8, 4), heading: (0, 1), action: None, reward: 1.32832671257
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'right'), 'deadline': 15, 't': 10, 'action': None, 'reward': 1.3283267125678406, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 1.33)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (8, 4), heading: (0, 1), action: None, reward: 1.70284704769
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 14, 't': 11, 'action': None, 'reward': 1.702847047693002, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.70)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (8, 4), heading: (0, 1), action: None, reward: 1.60447274132
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 13, 't': 12, 'action': None, 'reward': 1.6044727413249538, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.60)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 4), heading: (1, 0), action: left, reward: 2.43577361285
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 12, 't': 13, 'action': 'left', 'reward': 2.435773612847106, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.44)
44% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 234
\-------------------------
Environment.reset(): Trial set up with start = (1, 5), destination = (6, 3), deadline = 25
Simulating trial. . .
epsilon = 0.0973; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: None, reward: 2.44092983145
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 25, 't': 0, 'action': None, 'reward': 2.4409298314528654, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.44)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: None, reward: 1.58256726586
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 24, 't': 1, 'action': None, 'reward': 1.5825672658622738, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.58)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: None, reward: 1.33782900229
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.3378290022869934, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.34)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: forward, reward: 1.67570668187
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': 1.6757066818748045, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.68)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 5), heading: (-1, 0), action: forward, reward: 2.82472897547
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 2.824728975473832, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.82)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 6), heading: (0, 1), action: left, reward: 0.624607236191
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 20, 't': 5, 'action': 'left', 'reward': 0.6246072361911593, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove left instead of forward. (rewarded 0.62)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (6, 6), heading: (-1, 0), action: right, reward: 2.38229375297
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 19, 't': 6, 'action': 'right', 'reward': 2.3822937529664596, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent followed the waypoint right. (rewarded 2.38)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 6), heading: (-1, 0), action: None, reward: 1.53556315088
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 18, 't': 7, 'action': None, 'reward': 1.535563150883467, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.54)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (6, 6), heading: (-1, 0), action: None, reward: 1.06553369219
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 17, 't': 8, 'action': None, 'reward': 1.0655336921896144, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.07)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (6, 6), heading: (-1, 0), action: None, reward: 2.00559269038
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 16, 't': 9, 'action': None, 'reward': 2.0055926903840797, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.01)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (6, 5), heading: (0, -1), action: right, reward: 1.02505604981
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', 'right'), 'deadline': 15, 't': 10, 'action': 'right', 'reward': 1.0250560498098529, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', 'right')
Agent drove right instead of left. (rewarded 1.03)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (6, 4), heading: (0, -1), action: forward, reward: 1.50531652733
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 14, 't': 11, 'action': 'forward', 'reward': 1.5053165273279103, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.51)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (6, 4), heading: (0, -1), action: None, reward: 2.06787680083
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 13, 't': 12, 'action': None, 'reward': 2.0678768008259922, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.07)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 3), heading: (0, -1), action: forward, reward: 2.24884560048
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 12, 't': 13, 'action': 'forward', 'reward': 2.24884560047734, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.25)
44% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 235
\-------------------------
Environment.reset(): Trial set up with start = (8, 6), destination = (5, 3), deadline = 30
Simulating trial. . .
epsilon = 0.0963; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 6), heading: (0, -1), action: None, reward: 1.2171447706
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 30, 't': 0, 'action': None, 'reward': 1.2171447705971459, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.22)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 6), heading: (0, -1), action: None, reward: 2.19664995471
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'right'), 'deadline': 29, 't': 1, 'action': None, 'reward': 2.196649954710508, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'right')
Agent properly idled at a red light. (rewarded 2.20)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 6), heading: (0, -1), action: None, reward: 1.74667189963
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 28, 't': 2, 'action': None, 'reward': 1.7466718996318553, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.75)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 6), heading: (0, -1), action: None, reward: 1.59969191451
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 27, 't': 3, 'action': None, 'reward': 1.5996919145122561, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.60)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 6), heading: (-1, 0), action: left, reward: 2.8048242836
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 26, 't': 4, 'action': 'left', 'reward': 2.8048242835994937, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.80)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (6, 6), heading: (-1, 0), action: forward, reward: 2.41686359565
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 25, 't': 5, 'action': 'forward', 'reward': 2.4168635956481666, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.42)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (5, 6), heading: (-1, 0), action: forward, reward: 1.15030594813
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 24, 't': 6, 'action': 'forward', 'reward': 1.1503059481327091, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.15)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (5, 6), heading: (-1, 0), action: None, reward: 1.03222307774
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 23, 't': 7, 'action': None, 'reward': 1.032223077744178, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.03)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (5, 6), heading: (-1, 0), action: None, reward: 1.43018516935
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 22, 't': 8, 'action': None, 'reward': 1.4301851693464818, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 1.43)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (5, 6), heading: (-1, 0), action: None, reward: 2.83306279404
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 21, 't': 9, 'action': None, 'reward': 2.8330627940413287, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 2.83)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (5, 5), heading: (0, -1), action: right, reward: -0.0271093261655
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'forward', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', 'right'), 'deadline': 20, 't': 10, 'action': 'right', 'reward': -0.02710932616549422, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', 'right')
Agent drove right instead of left. (rewarded -0.03)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (5, 5), heading: (0, -1), action: right, reward: -19.2128056881
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 3, 'light': 'red', 'state': ('forward', 'red', 'forward', 'forward'), 'deadline': 19, 't': 11, 'action': 'right', 'reward': -19.212805688078543, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'forward')
Agent attempted driving right through traffic and cause a minor accident. (rewarded -19.21)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (5, 5), heading: (0, -1), action: forward, reward: -9.13218913842
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 18, 't': 12, 'action': 'forward', 'reward': -9.132189138416914, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent attempted driving forward through a red light. (rewarded -9.13)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (5, 4), heading: (0, -1), action: forward, reward: 1.32359969183
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 17, 't': 13, 'action': 'forward', 'reward': 1.323599691825144, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.32)
53% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 3), heading: (0, -1), action: forward, reward: 2.14704125591
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 16, 't': 14, 'action': 'forward', 'reward': 2.147041255913325, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.15)
50% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 236
\-------------------------
Environment.reset(): Trial set up with start = (4, 3), destination = (8, 2), deadline = 25
Simulating trial. . .
epsilon = 0.0954; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 3), heading: (-1, 0), action: right, reward: 1.17915000124
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 1.1791500012368477, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent followed the waypoint right. (rewarded 1.18)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 3), heading: (-1, 0), action: None, reward: 2.39667614198
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 24, 't': 1, 'action': None, 'reward': 2.396676141979148, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.40)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 3), heading: (-1, 0), action: None, reward: 1.18237147722
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.1823714772197407, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.18)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 3), heading: (-1, 0), action: forward, reward: 2.58244540889
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': 2.582445408893929, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.58)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 3), heading: (-1, 0), action: forward, reward: 1.31954827693
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 1.319548276934282, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.32)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 3), heading: (-1, 0), action: None, reward: 2.21994642293
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 20, 't': 5, 'action': None, 'reward': 2.21994642293254, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.22)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 3), heading: (-1, 0), action: None, reward: 1.69961943252
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 19, 't': 6, 'action': None, 'reward': 1.6996194325214522, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.70)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (8, 3), heading: (-1, 0), action: forward, reward: 2.77870973501
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 2.778709735014476, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.78)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (8, 2), heading: (0, -1), action: right, reward: 1.00907954585
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 17, 't': 8, 'action': 'right', 'reward': 1.0090795458466626, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 1.01)
64% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 237
\-------------------------
Environment.reset(): Trial set up with start = (8, 2), destination = (4, 2), deadline = 20
Simulating trial. . .
epsilon = 0.0944; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 7), heading: (0, -1), action: right, reward: 1.6824867883
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', 'left'), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.6824867882990875, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', 'left')
Agent followed the waypoint right. (rewarded 1.68)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: right, reward: 1.27741249172
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 1.277412491716883, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 1.28)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: None, reward: 2.78686411347
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.7868641134680097, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.79)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 7), heading: (1, 0), action: forward, reward: 0.98883712846
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 0.9888371284598523, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 0.99)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 7), heading: (1, 0), action: forward, reward: 1.61426036786
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 1.614260367860206, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.61)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (3, 7), heading: (1, 0), action: None, reward: 1.90120144486
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 15, 't': 5, 'action': None, 'reward': 1.901201444857951, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.90)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 7), heading: (1, 0), action: None, reward: 2.5679763536
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 14, 't': 6, 'action': None, 'reward': 2.5679763535993736, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.57)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (4, 7), heading: (1, 0), action: forward, reward: 2.06563855104
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 2.06563855104359, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.07)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 2), heading: (0, 1), action: right, reward: 2.54821575141
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 12, 't': 8, 'action': 'right', 'reward': 2.5482157514135495, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.55)
55% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 238
\-------------------------
Environment.reset(): Trial set up with start = (1, 5), destination = (7, 7), deadline = 20
Simulating trial. . .
epsilon = 0.0935; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: forward, reward: 1.21337642954
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', 'right'), 'deadline': 20, 't': 0, 'action': 'forward', 'reward': 1.2133764295361678, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', 'right')
Agent drove forward instead of right. (rewarded 1.21)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: None, reward: 1.61649891338
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'forward'), 'deadline': 19, 't': 1, 'action': None, 'reward': 1.6164989133802847, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.62)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: right, reward: 2.84949299526
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 18, 't': 2, 'action': 'right', 'reward': 2.849492995259789, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.85)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: None, reward: 1.82685622227
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'forward'), 'deadline': 17, 't': 3, 'action': None, 'reward': 1.8268562222746185, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.83)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: right, reward: 2.45954969493
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 16, 't': 4, 'action': 'right', 'reward': 2.4595496949260984, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.46)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 6), heading: (-1, 0), action: forward, reward: 1.14251842117
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 1.1425184211663, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.14)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (8, 6), heading: (-1, 0), action: None, reward: 1.90765259814
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 14, 't': 6, 'action': None, 'reward': 1.9076525981387717, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.91)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (7, 6), heading: (-1, 0), action: forward, reward: 1.55940719441
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 1.559407194405468, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.56)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (7, 6), heading: (-1, 0), action: None, reward: 2.30400203076
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 12, 't': 8, 'action': None, 'reward': 2.3040020307616587, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.30)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (7, 6), heading: (-1, 0), action: None, reward: 1.73855484525
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 11, 't': 9, 'action': None, 'reward': 1.7385548452522464, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.74)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (7, 6), heading: (-1, 0), action: None, reward: 1.44759981041
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 10, 't': 10, 'action': None, 'reward': 1.4475998104071501, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.45)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (7, 7), heading: (0, 1), action: left, reward: 1.20476069639
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'right'), 'deadline': 9, 't': 11, 'action': 'left', 'reward': 1.2047606963949054, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'right')
Agent followed the waypoint left. (rewarded 1.20)
40% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 239
\-------------------------
Environment.reset(): Trial set up with start = (4, 3), destination = (3, 6), deadline = 20
Simulating trial. . .
epsilon = 0.0926; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 3), heading: (-1, 0), action: right, reward: 1.2021419509
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'right', None), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.2021419509011875, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', None)
Agent followed the waypoint right. (rewarded 1.20)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 3), heading: (-1, 0), action: forward, reward: 1.55316057263
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', 'right'), 'deadline': 19, 't': 1, 'action': 'forward', 'reward': 1.553160572627755, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', 'right')
Agent drove forward instead of right. (rewarded 1.55)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 2), heading: (0, -1), action: right, reward: 1.29078435806
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 18, 't': 2, 'action': 'right', 'reward': 1.290784358056279, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent followed the waypoint right. (rewarded 1.29)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 2), heading: (1, 0), action: right, reward: 2.32632701765
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 17, 't': 3, 'action': 'right', 'reward': 2.3263270176485804, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 2.33)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 2), heading: (1, 0), action: None, reward: 2.67103520386
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 16, 't': 4, 'action': None, 'reward': 2.6710352038563387, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.67)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (3, 7), heading: (0, -1), action: left, reward: 2.75293106249
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 15, 't': 5, 'action': 'left', 'reward': 2.7529310624893055, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.75)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 7), heading: (0, -1), action: None, reward: 1.86532473463
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 14, 't': 6, 'action': None, 'reward': 1.8653247346322575, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.87)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 6), heading: (0, -1), action: forward, reward: 1.13119284522
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 1.1311928452243518, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.13)
60% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 240
\-------------------------
Environment.reset(): Trial set up with start = (1, 5), destination = (2, 2), deadline = 20
Simulating trial. . .
epsilon = 0.0916; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 4), heading: (0, -1), action: right, reward: 1.01311810776
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.0131181077607527, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 1.01)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 4), heading: (1, 0), action: right, reward: 2.96208445459
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 2.9620844545859066, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent followed the waypoint right. (rewarded 2.96)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 3), heading: (0, -1), action: left, reward: 2.24143249387
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 18, 't': 2, 'action': 'left', 'reward': 2.2414324938684222, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 2.24)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (2, 2), heading: (0, -1), action: forward, reward: 2.69064151439
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 2.690641514388609, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.69)
80% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 241
\-------------------------
Environment.reset(): Trial set up with start = (5, 3), destination = (2, 4), deadline = 20
Simulating trial. . .
epsilon = 0.0907; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 3), heading: (-1, 0), action: None, reward: 2.54876491265
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 20, 't': 0, 'action': None, 'reward': 2.5487649126484975, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.55)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 3), heading: (-1, 0), action: None, reward: 2.49946108581
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 19, 't': 1, 'action': None, 'reward': 2.4994610858063764, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 2.50)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 3), heading: (-1, 0), action: None, reward: 2.82152806418
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.8215280641790867, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.82)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 3), heading: (-1, 0), action: None, reward: 2.57131593576
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 17, 't': 3, 'action': None, 'reward': 2.5713159357593565, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.57)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 3), heading: (-1, 0), action: None, reward: 1.79462629816
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 16, 't': 4, 'action': None, 'reward': 1.7946262981588752, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.79)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 3), heading: (-1, 0), action: None, reward: 2.32403321625
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 15, 't': 5, 'action': None, 'reward': 2.3240332162456436, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.32)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 3), heading: (-1, 0), action: forward, reward: 1.17866851828
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 14, 't': 6, 'action': 'forward', 'reward': 1.178668518275867, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.18)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (4, 3), heading: (-1, 0), action: None, reward: 1.83364717818
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 13, 't': 7, 'action': None, 'reward': 1.8336471781787553, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 1.83)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (4, 3), heading: (-1, 0), action: None, reward: 2.3772287174
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 12, 't': 8, 'action': None, 'reward': 2.377228717395769, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.38)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (4, 3), heading: (-1, 0), action: None, reward: 1.95469814972
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 11, 't': 9, 'action': None, 'reward': 1.9546981497185199, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.95)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (3, 3), heading: (-1, 0), action: forward, reward: 2.43628126938
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 10, 't': 10, 'action': 'forward', 'reward': 2.436281269377279, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.44)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (3, 3), heading: (-1, 0), action: None, reward: 2.03037547292
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 9, 't': 11, 'action': None, 'reward': 2.030375472924116, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.03)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (3, 3), heading: (-1, 0), action: None, reward: 1.76285031306
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 8, 't': 12, 'action': None, 'reward': 1.762850313056828, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.76)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (2, 3), heading: (-1, 0), action: forward, reward: 1.31282984387
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 7, 't': 13, 'action': 'forward', 'reward': 1.3128298438740496, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.31)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (2, 3), heading: (-1, 0), action: None, reward: 2.33149176338
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 6, 't': 14, 'action': None, 'reward': 2.331491763377509, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.33)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (2, 3), heading: (-1, 0), action: None, reward: 1.46566303774
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 5, 't': 15, 'action': None, 'reward': 1.4656630377415563, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.47)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (2, 2), heading: (0, -1), action: right, reward: 1.05316087065
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', 'forward'), 'deadline': 4, 't': 16, 'action': 'right', 'reward': 1.0531608706508546, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', 'forward')
Agent drove right instead of left. (rewarded 1.05)
15% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (3, 2), heading: (1, 0), action: right, reward: 0.587268325943
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 3, 't': 17, 'action': 'right', 'reward': 0.5872683259425799, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent followed the waypoint right. (rewarded 0.59)
10% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (3, 3), heading: (0, 1), action: right, reward: 1.9756417187
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 2, 't': 18, 'action': 'right', 'reward': 1.9756417187033892, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 1.98)
5% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (2, 3), heading: (-1, 0), action: right, reward: 1.68879270658
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 1, 't': 19, 'action': 'right', 'reward': 1.6887927065828072, 'waypoint': 'right'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.69)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 242
\-------------------------
Environment.reset(): Trial set up with start = (6, 7), destination = (4, 4), deadline = 25
Simulating trial. . .
epsilon = 0.0898; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 7), heading: (-1, 0), action: forward, reward: 1.54833895034
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'left'), 'deadline': 25, 't': 0, 'action': 'forward', 'reward': 1.548338950337421, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'left')
Agent followed the waypoint forward. (rewarded 1.55)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 7), heading: (-1, 0), action: None, reward: 2.27846082664
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 24, 't': 1, 'action': None, 'reward': 2.278460826642537, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.28)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 7), heading: (-1, 0), action: None, reward: 1.86675843295
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.8667584329465583, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.87)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 7), heading: (-1, 0), action: None, reward: 2.26804545941
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 22, 't': 3, 'action': None, 'reward': 2.2680454594116224, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 2.27)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 7), heading: (-1, 0), action: None, reward: 2.34136425838
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 21, 't': 4, 'action': None, 'reward': 2.341364258381243, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.34)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (4, 7), heading: (-1, 0), action: forward, reward: 2.29385798303
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 20, 't': 5, 'action': 'forward', 'reward': 2.293857983033353, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'right')
Agent followed the waypoint forward. (rewarded 2.29)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 7), heading: (-1, 0), action: None, reward: 1.59243172655
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 19, 't': 6, 'action': None, 'reward': 1.5924317265455654, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.59)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (4, 2), heading: (0, 1), action: left, reward: 2.8593585634
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 18, 't': 7, 'action': 'left', 'reward': 2.8593585634030845, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.86)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (4, 2), heading: (0, 1), action: None, reward: 0.991880759568
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 17, 't': 8, 'action': None, 'reward': 0.991880759568424, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 0.99)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (4, 3), heading: (0, 1), action: forward, reward: 2.12357934658
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 16, 't': 9, 'action': 'forward', 'reward': 2.1235793465848793, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.12)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (3, 3), heading: (-1, 0), action: right, reward: 0.712618917687
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 15, 't': 10, 'action': 'right', 'reward': 0.7126189176866232, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent drove right instead of forward. (rewarded 0.71)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (3, 4), heading: (0, 1), action: left, reward: 1.97267436463
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 14, 't': 11, 'action': 'left', 'reward': 1.9726743646323297, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.97)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 4), heading: (1, 0), action: left, reward: 1.91423340697
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 13, 't': 12, 'action': 'left', 'reward': 1.9142334069695255, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 1.91)
48% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 243
\-------------------------
Environment.reset(): Trial set up with start = (7, 3), destination = (6, 6), deadline = 20
Simulating trial. . .
epsilon = 0.0889; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 3), heading: (-1, 0), action: forward, reward: 2.1146643192
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 20, 't': 0, 'action': 'forward', 'reward': 2.114664319202241, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.11)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 3), heading: (-1, 0), action: left, reward: -9.26067615558
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 19, 't': 1, 'action': 'left', 'reward': -9.260676155577258, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent attempted driving left through a red light. (rewarded -9.26)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 2), heading: (0, -1), action: right, reward: 2.69818724258
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 18, 't': 2, 'action': 'right', 'reward': 2.6981872425831264, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.70)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (6, 7), heading: (0, -1), action: forward, reward: 2.20362161623
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 2.2036216162317146, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.20)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 6), heading: (0, -1), action: forward, reward: 2.480651047
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 2.480651047002697, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.48)
75% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 244
\-------------------------
Environment.reset(): Trial set up with start = (6, 7), destination = (8, 3), deadline = 20
Simulating trial. . .
epsilon = 0.0880; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (7, 7), heading: (1, 0), action: forward, reward: 2.44530607013
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 20, 't': 0, 'action': 'forward', 'reward': 2.4453060701300497, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.45)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: forward, reward: 2.308912938
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 19, 't': 1, 'action': 'forward', 'reward': 2.308912937996695, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 2.31)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 2), heading: (0, 1), action: right, reward: 2.01067255509
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': 'right', 'reward': 2.010672555094049, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 2.01)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 2), heading: (0, 1), action: None, reward: 1.34934828175
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 1.3493482817474385, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.35)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (8, 3), heading: (0, 1), action: forward, reward: 1.07159032758
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 1.0715903275784295, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent followed the waypoint forward. (rewarded 1.07)
75% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 245
\-------------------------
Environment.reset(): Trial set up with start = (1, 5), destination = (8, 2), deadline = 20
Simulating trial. . .
epsilon = 0.0872; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 5), heading: (0, 1), action: None, reward: 1.82854074754
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'forward'), 'deadline': 20, 't': 0, 'action': None, 'reward': 1.828540747538194, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.83)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: right, reward: 1.59741181967
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 1.597411819667086, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 1.60)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 6), heading: (0, 1), action: left, reward: 1.1813047527
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 18, 't': 2, 'action': 'left', 'reward': 1.181304752698349, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 1.18)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 6), heading: (0, 1), action: None, reward: 1.49797472083
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 1.4979747208340883, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.50)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 7), heading: (0, 1), action: forward, reward: 1.99889500359
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 1.9988950035857689, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.00)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (8, 2), heading: (0, 1), action: forward, reward: 2.29812009102
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 2.2981200910163775, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.30)
70% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 246
\-------------------------
Environment.reset(): Trial set up with start = (8, 2), destination = (5, 6), deadline = 25
Simulating trial. . .
epsilon = 0.0863; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: forward, reward: 0.776620785526
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', 'forward'), 'deadline': 25, 't': 0, 'action': 'forward', 'reward': 0.7766207855255309, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', 'forward')
Agent drove forward instead of left. (rewarded 0.78)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 7), heading: (0, -1), action: left, reward: 1.6002277659
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 24, 't': 1, 'action': 'left', 'reward': 1.6002277659003274, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.60)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: left, reward: 2.7253840806
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 23, 't': 2, 'action': 'left', 'reward': 2.7253840805984515, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.73)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: None, reward: 1.1705982782
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 22, 't': 3, 'action': None, 'reward': 1.170598278204838, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.17)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: forward, reward: 1.21795731009
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 1.2179573100897443, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.22)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: None, reward: 2.63797479649
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 20, 't': 5, 'action': None, 'reward': 2.637974796486399, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.64)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: None, reward: 2.32784288673
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 19, 't': 6, 'action': None, 'reward': 2.327842886729386, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.33)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (7, 6), heading: (0, -1), action: right, reward: 1.45701916274
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', 'forward'), 'deadline': 18, 't': 7, 'action': 'right', 'reward': 1.4570191627380396, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', 'forward')
Agent drove right instead of forward. (rewarded 1.46)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (7, 6), heading: (0, -1), action: None, reward: 1.33676422225
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 17, 't': 8, 'action': None, 'reward': 1.3367642222549834, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.34)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (7, 6), heading: (0, -1), action: None, reward: 1.40146098696
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 16, 't': 9, 'action': None, 'reward': 1.4014609869591177, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.40)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (7, 6), heading: (0, -1), action: None, reward: 0.993856525785
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 15, 't': 10, 'action': None, 'reward': 0.9938565257853946, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 0.99)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (6, 6), heading: (-1, 0), action: left, reward: 2.3852219219
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 14, 't': 11, 'action': 'left', 'reward': 2.3852219218968562, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.39)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (6, 6), heading: (-1, 0), action: None, reward: 1.87704208005
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 13, 't': 12, 'action': None, 'reward': 1.8770420800474046, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.88)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (6, 7), heading: (0, 1), action: left, reward: 0.139550877913
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 12, 't': 13, 'action': 'left', 'reward': 0.1395508779126018, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove left instead of forward. (rewarded 0.14)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (5, 7), heading: (-1, 0), action: right, reward: 2.10996433814
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 11, 't': 14, 'action': 'right', 'reward': 2.1099643381406046, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.11)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 6), heading: (0, -1), action: right, reward: 1.29919021005
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 10, 't': 15, 'action': 'right', 'reward': 1.2991902100533121, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent followed the waypoint right. (rewarded 1.30)
36% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 247
\-------------------------
Environment.reset(): Trial set up with start = (4, 5), destination = (6, 7), deadline = 20
Simulating trial. . .
epsilon = 0.0854; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 5), heading: (1, 0), action: right, reward: 2.5036125473
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 2.5036125473047903, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 2.50)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 5), heading: (1, 0), action: None, reward: 1.88283280243
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 19, 't': 1, 'action': None, 'reward': 1.8828328024281993, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.88)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 5), heading: (1, 0), action: None, reward: 1.93920412168
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.9392041216755374, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.94)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 6), heading: (0, 1), action: right, reward: 1.85771679355
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 17, 't': 3, 'action': 'right', 'reward': 1.8577167935484806, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent drove right instead of forward. (rewarded 1.86)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 6), heading: (0, 1), action: None, reward: 1.19088159639
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', 'right'), 'deadline': 16, 't': 4, 'action': None, 'reward': 1.1908815963906827, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'right')
Agent properly idled at a red light. (rewarded 1.19)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 6), heading: (0, 1), action: None, reward: 1.16296016881
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 15, 't': 5, 'action': None, 'reward': 1.1629601688132438, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.16)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (5, 7), heading: (0, 1), action: forward, reward: 0.473281361312
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', None), 'deadline': 14, 't': 6, 'action': 'forward', 'reward': 0.4732813613121054, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', None)
Agent drove forward instead of left. (rewarded 0.47)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 7), heading: (1, 0), action: left, reward: 2.02927971824
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'forward'), 'deadline': 13, 't': 7, 'action': 'left', 'reward': 2.029279718236901, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'forward')
Agent followed the waypoint left. (rewarded 2.03)
60% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 248
\-------------------------
Environment.reset(): Trial set up with start = (1, 6), destination = (3, 3), deadline = 25
Simulating trial. . .
epsilon = 0.0846; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 6), heading: (-1, 0), action: right, reward: 0.754416622814
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 0.7544166228135405, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 0.75)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 7), heading: (0, 1), action: left, reward: 1.47894961751
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 24, 't': 1, 'action': 'left', 'reward': 1.478949617507736, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 1.48)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 7), heading: (0, 1), action: None, reward: 1.34331520382
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.3433152038204843, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.34)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 7), heading: (0, 1), action: None, reward: 1.07424660008
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 22, 't': 3, 'action': None, 'reward': 1.0742466000761566, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.07)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: right, reward: 1.07011907645
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 21, 't': 4, 'action': 'right', 'reward': 1.070119076451022, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 1.07)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 2), heading: (0, 1), action: left, reward: 1.90949866797
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 20, 't': 5, 'action': 'left', 'reward': 1.9094986679669848, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 1.91)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (7, 2), heading: (0, 1), action: None, reward: 1.41595146772
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 19, 't': 6, 'action': None, 'reward': 1.4159514677208824, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.42)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (8, 2), heading: (1, 0), action: left, reward: 1.18136292316
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 18, 't': 7, 'action': 'left', 'reward': 1.1813629231575944, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.18)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (8, 3), heading: (0, 1), action: right, reward: 1.57124243781
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 17, 't': 8, 'action': 'right', 'reward': 1.5712424378096044, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent drove right instead of forward. (rewarded 1.57)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (8, 3), heading: (0, 1), action: None, reward: 1.38361979518
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 16, 't': 9, 'action': None, 'reward': 1.3836197951803801, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.38)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (7, 3), heading: (-1, 0), action: right, reward: 1.52212375864
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 15, 't': 10, 'action': 'right', 'reward': 1.522123758639161, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent drove right instead of left. (rewarded 1.52)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (7, 3), heading: (-1, 0), action: right, reward: -20.71727381
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': 'forward'}, 'violation': 3, 'light': 'red', 'state': ('right', 'red', 'left', 'forward'), 'deadline': 14, 't': 11, 'action': 'right', 'reward': -20.717273810015907, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', 'forward')
Agent attempted driving right through traffic and cause a minor accident. (rewarded -20.72)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (7, 2), heading: (0, -1), action: right, reward: 1.82028981744
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 13, 't': 12, 'action': 'right', 'reward': 1.8202898174387534, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 1.82)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (8, 2), heading: (1, 0), action: right, reward: 1.58767135138
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 12, 't': 13, 'action': 'right', 'reward': 1.5876713513795362, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent followed the waypoint right. (rewarded 1.59)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (8, 2), heading: (1, 0), action: None, reward: 0.779926722775
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'right'), 'deadline': 11, 't': 14, 'action': None, 'reward': 0.7799267227746902, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'right')
Agent properly idled at a red light. (rewarded 0.78)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (8, 2), heading: (1, 0), action: None, reward: 1.72096505517
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 10, 't': 15, 'action': None, 'reward': 1.7209650551749487, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.72)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: forward, reward: 1.66366109495
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 9, 't': 16, 'action': 'forward', 'reward': 1.663661094948748, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.66)
32% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (2, 2), heading: (1, 0), action: forward, reward: 1.58841790021
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 8, 't': 17, 'action': 'forward', 'reward': 1.5884179002053564, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.59)
28% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (2, 7), heading: (0, -1), action: left, reward: 1.31634844464
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 7, 't': 18, 'action': 'left', 'reward': 1.3163484446426805, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove left instead of forward. (rewarded 1.32)
24% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (2, 7), heading: (0, -1), action: None, reward: -5.1612238607
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 6, 't': 19, 'action': None, 'reward': -5.161223860699353, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.16)
20% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (3, 7), heading: (1, 0), action: right, reward: 0.703416095233
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'right', 'forward'), 'deadline': 5, 't': 20, 'action': 'right', 'reward': 0.7034160952332185, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'right', 'forward')
Agent followed the waypoint right. (rewarded 0.70)
16% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (3, 2), heading: (0, 1), action: right, reward: 2.10169196061
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 4, 't': 21, 'action': 'right', 'reward': 2.101691960609104, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.10)
12% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 3), heading: (0, 1), action: forward, reward: 1.31716132747
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 3, 't': 22, 'action': 'forward', 'reward': 1.3171613274671023, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.32)
8% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 249
\-------------------------
Environment.reset(): Trial set up with start = (8, 3), destination = (3, 5), deadline = 25
Simulating trial. . .
epsilon = 0.0837; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 3), heading: (0, -1), action: None, reward: -5.82041863732
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', None, 'right'), 'deadline': 25, 't': 0, 'action': None, 'reward': -5.820418637323824, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'right')
Agent idled at a green light with no oncoming traffic. (rewarded -5.82)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: right, reward: 2.02967069916
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 24, 't': 1, 'action': 'right', 'reward': 2.0296706991565765, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.03)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: forward, reward: 2.50015044036
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 23, 't': 2, 'action': 'forward', 'reward': 2.5001504403562165, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'right')
Agent followed the waypoint forward. (rewarded 2.50)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 1.82099097562
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 22, 't': 3, 'action': None, 'reward': 1.820990975622362, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.82)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 2.28424678091
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 21, 't': 4, 'action': None, 'reward': 2.284246780912202, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.28)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 1.59414527528
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 20, 't': 5, 'action': None, 'reward': 1.594145275277644, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.59)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: forward, reward: 2.26451662852
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 19, 't': 6, 'action': 'forward', 'reward': 2.2645166285221396, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent followed the waypoint forward. (rewarded 2.26)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 4), heading: (0, 1), action: right, reward: 2.08168002538
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 18, 't': 7, 'action': 'right', 'reward': 2.081680025380348, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 2.08)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 5), heading: (0, 1), action: forward, reward: 2.58624455074
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 17, 't': 8, 'action': 'forward', 'reward': 2.5862445507376735, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'right')
Agent followed the waypoint forward. (rewarded 2.59)
64% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 250
\-------------------------
Environment.reset(): Trial set up with start = (3, 5), destination = (7, 2), deadline = 35
Simulating trial. . .
epsilon = 0.0829; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: left, reward: 2.65511763312
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 35, 't': 0, 'action': 'left', 'reward': 2.6551176331208595, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.66)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: forward, reward: 2.29617131527
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 34, 't': 1, 'action': 'forward', 'reward': 2.2961713152677676, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.30)
94% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: None, reward: 1.12808357189
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 33, 't': 2, 'action': None, 'reward': 1.1280835718856708, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.13)
91% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: forward, reward: 1.0077831738
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 32, 't': 3, 'action': 'forward', 'reward': 1.0077831737990062, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.01)
89% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 5), heading: (-1, 0), action: forward, reward: 2.00952774081
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 31, 't': 4, 'action': 'forward', 'reward': 2.0095277408112238, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.01)
86% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 6), heading: (0, 1), action: left, reward: 1.87657132199
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 30, 't': 5, 'action': 'left', 'reward': 1.8765713219902744, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 1.88)
83% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (7, 7), heading: (0, 1), action: forward, reward: 1.34117862243
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 29, 't': 6, 'action': 'forward', 'reward': 1.3411786224264135, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.34)
80% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (7, 7), heading: (0, 1), action: None, reward: 1.15820313937
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 28, 't': 7, 'action': None, 'reward': 1.1582031393665486, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.16)
77% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (7, 7), heading: (0, 1), action: None, reward: 2.00032983538
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 27, 't': 8, 'action': None, 'reward': 2.0003298353823653, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.00)
74% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (7, 7), heading: (0, 1), action: None, reward: 2.60163250996
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 26, 't': 9, 'action': None, 'reward': 2.601632509958487, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.60)
71% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (7, 2), heading: (0, 1), action: forward, reward: 2.76906985161
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 25, 't': 10, 'action': 'forward', 'reward': 2.769069851608301, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.77)
69% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 251
\-------------------------
Environment.reset(): Trial set up with start = (6, 7), destination = (8, 5), deadline = 20
Simulating trial. . .
epsilon = 0.0821; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 6), heading: (0, -1), action: right, reward: 1.84082772494
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.840827724939171, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.84)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 6), heading: (-1, 0), action: left, reward: 0.196075677301
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', 'forward'), 'deadline': 19, 't': 1, 'action': 'left', 'reward': 0.19607567730142372, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', 'forward')
Agent drove left instead of right. (rewarded 0.20)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 5), heading: (0, -1), action: right, reward: 1.63725745252
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 18, 't': 2, 'action': 'right', 'reward': 1.637257452519183, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 1.64)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 5), heading: (0, -1), action: None, reward: -4.92784124804
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 17, 't': 3, 'action': None, 'reward': -4.92784124804238, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.93)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (6, 5), heading: (1, 0), action: right, reward: 1.35572713635
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 16, 't': 4, 'action': 'right', 'reward': 1.3557271363471437, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.36)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (6, 5), heading: (1, 0), action: None, reward: 1.25422419817
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 15, 't': 5, 'action': None, 'reward': 1.2542241981747715, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.25)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (6, 5), heading: (1, 0), action: None, reward: 1.8669467496
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 14, 't': 6, 'action': None, 'reward': 1.8669467496018295, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.87)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (7, 5), heading: (1, 0), action: forward, reward: 2.77738428613
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 2.777384286134647, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.78)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (7, 5), heading: (1, 0), action: None, reward: 2.26438263187
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 12, 't': 8, 'action': None, 'reward': 2.264382631869336, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.26)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (7, 5), heading: (1, 0), action: None, reward: 1.40983071529
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 11, 't': 9, 'action': None, 'reward': 1.4098307152943406, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.41)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (7, 5), heading: (1, 0), action: None, reward: 2.44437097324
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 10, 't': 10, 'action': None, 'reward': 2.4443709732415257, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.44)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (8, 5), heading: (1, 0), action: forward, reward: 2.52070148132
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 9, 't': 11, 'action': 'forward', 'reward': 2.52070148131771, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.52)
40% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 252
\-------------------------
Environment.reset(): Trial set up with start = (5, 3), destination = (1, 3), deadline = 20
Simulating trial. . .
epsilon = 0.0813; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 2), heading: (0, -1), action: left, reward: 1.08586807122
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 20, 't': 0, 'action': 'left', 'reward': 1.0858680712168793, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'right')
Agent drove left instead of forward. (rewarded 1.09)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 2), heading: (1, 0), action: right, reward: 2.28205181379
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 2.2820518137890957, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent followed the waypoint right. (rewarded 2.28)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 2), heading: (1, 0), action: None, reward: 1.76178585737
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.7617858573743705, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.76)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 2), heading: (1, 0), action: forward, reward: 1.38919358689
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 1.3891935868904903, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.39)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 2), heading: (1, 0), action: forward, reward: 2.11445776246
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 2.1144577624576577, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.11)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 2), heading: (1, 0), action: None, reward: 2.1886340693
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 15, 't': 5, 'action': None, 'reward': 2.1886340692985224, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 2.19)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (8, 2), heading: (1, 0), action: None, reward: 2.40540354523
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'forward'), 'deadline': 14, 't': 6, 'action': None, 'reward': 2.40540354523319, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 2.41)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (8, 7), heading: (0, -1), action: left, reward: 1.7644967169
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'right'), 'deadline': 13, 't': 7, 'action': 'left', 'reward': 1.7644967168986139, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'right')
Agent drove left instead of forward. (rewarded 1.76)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: right, reward: 1.12133196416
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'right'), 'deadline': 12, 't': 8, 'action': 'right', 'reward': 1.1213319641623822, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'right')
Agent followed the waypoint right. (rewarded 1.12)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (1, 2), heading: (0, 1), action: right, reward: 1.48287177831
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 11, 't': 9, 'action': 'right', 'reward': 1.4828717783088479, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent followed the waypoint right. (rewarded 1.48)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 3), heading: (0, 1), action: forward, reward: 2.72337110184
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 10, 't': 10, 'action': 'forward', 'reward': 2.723371101842683, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 2.72)
45% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 253
\-------------------------
Environment.reset(): Trial set up with start = (6, 7), destination = (3, 3), deadline = 25
Simulating trial. . .
epsilon = 0.0805; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 2), heading: (0, 1), action: right, reward: 2.62893294557
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 2.628932945570635, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent followed the waypoint right. (rewarded 2.63)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 2), heading: (-1, 0), action: right, reward: 2.267844529
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 24, 't': 1, 'action': 'right', 'reward': 2.267844528997001, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 2.27)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 7), heading: (0, -1), action: right, reward: -0.0115410747858
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 23, 't': 2, 'action': 'right', 'reward': -0.011541074785828886, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent drove right instead of forward. (rewarded -0.01)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 7), heading: (0, -1), action: None, reward: 2.24363745391
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 22, 't': 3, 'action': None, 'reward': 2.243637453907068, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.24)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 7), heading: (0, -1), action: None, reward: 1.84304699802
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 21, 't': 4, 'action': None, 'reward': 1.843046998024664, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.84)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (4, 7), heading: (-1, 0), action: left, reward: 1.62202929618
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 20, 't': 5, 'action': 'left', 'reward': 1.622029296182801, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.62)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 7), heading: (-1, 0), action: None, reward: 1.78461946491
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 19, 't': 6, 'action': None, 'reward': 1.7846194649135456, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.78)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 7), heading: (-1, 0), action: forward, reward: 2.71562970193
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 2.715629701931812, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.72)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 2), heading: (0, 1), action: left, reward: 2.60755195769
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 17, 't': 8, 'action': 'left', 'reward': 2.607551957686662, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.61)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 3), heading: (0, 1), action: forward, reward: 1.62753057239
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 16, 't': 9, 'action': 'forward', 'reward': 1.6275305723927316, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.63)
60% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 254
\-------------------------
Environment.reset(): Trial set up with start = (7, 4), destination = (4, 7), deadline = 30
Simulating trial. . .
epsilon = 0.0797; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 4), heading: (-1, 0), action: left, reward: 1.84041886597
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 30, 't': 0, 'action': 'left', 'reward': 1.840418865966993, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.84)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 4), heading: (-1, 0), action: forward, reward: 1.22051136404
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 29, 't': 1, 'action': 'forward', 'reward': 1.220511364037046, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.22)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (4, 4), heading: (-1, 0), action: forward, reward: 2.28871394895
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 28, 't': 2, 'action': 'forward', 'reward': 2.2887139489455643, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 2.29)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (4, 3), heading: (0, -1), action: right, reward: 1.90100982611
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 27, 't': 3, 'action': 'right', 'reward': 1.9010098261109845, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.90)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 3), heading: (1, 0), action: right, reward: 0.517468286146
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'left'), 'deadline': 26, 't': 4, 'action': 'right', 'reward': 0.5174682861460284, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'left')
Agent drove right instead of forward. (rewarded 0.52)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 3), heading: (1, 0), action: None, reward: 2.85312864046
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'forward'), 'deadline': 25, 't': 5, 'action': None, 'reward': 2.8531286404611595, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 2.85)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (5, 2), heading: (0, -1), action: left, reward: 2.65325452664
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'right'), 'deadline': 24, 't': 6, 'action': 'left', 'reward': 2.653254526637267, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'right')
Agent followed the waypoint left. (rewarded 2.65)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (5, 2), heading: (0, -1), action: None, reward: 2.38771455481
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 23, 't': 7, 'action': None, 'reward': 2.3877145548091647, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.39)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (5, 2), heading: (0, -1), action: None, reward: 2.24467326482
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 22, 't': 8, 'action': None, 'reward': 2.2446732648208334, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.24)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (5, 2), heading: (0, -1), action: None, reward: 2.63015135043
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 21, 't': 9, 'action': None, 'reward': 2.6301513504268286, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.63)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (4, 2), heading: (-1, 0), action: left, reward: 1.92167192377
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'right'), 'deadline': 20, 't': 10, 'action': 'left', 'reward': 1.9216719237698545, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'right')
Agent followed the waypoint left. (rewarded 1.92)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 7), heading: (0, -1), action: right, reward: 1.68200150298
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 19, 't': 11, 'action': 'right', 'reward': 1.6820015029777937, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.68)
60% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 255
\-------------------------
Environment.reset(): Trial set up with start = (6, 4), destination = (3, 5), deadline = 20
Simulating trial. . .
epsilon = 0.0789; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 4), heading: (-1, 0), action: None, reward: 2.09789093176
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 20, 't': 0, 'action': None, 'reward': 2.0978909317571115, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.10)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 4), heading: (-1, 0), action: None, reward: 1.88953992252
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 19, 't': 1, 'action': None, 'reward': 1.8895399225219618, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.89)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 4), heading: (-1, 0), action: forward, reward: -10.5022400873
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 18, 't': 2, 'action': 'forward', 'reward': -10.50224008725891, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -10.50)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 4), heading: (-1, 0), action: forward, reward: 2.54591467274
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 2.5459146727369513, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 2.55)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (4, 4), heading: (-1, 0), action: forward, reward: 2.59043474509
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 2.590434745087971, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.59)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (3, 4), heading: (-1, 0), action: forward, reward: 1.41052383051
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 1.4105238305148209, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.41)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 4), heading: (-1, 0), action: None, reward: 2.18331892038
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 14, 't': 6, 'action': None, 'reward': 2.1833189203759975, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.18)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 4), heading: (-1, 0), action: None, reward: 1.20845311014
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', 'left'), 'deadline': 13, 't': 7, 'action': None, 'reward': 1.208453110138235, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', 'left')
Agent properly idled at a red light. (rewarded 1.21)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 4), heading: (-1, 0), action: None, reward: 2.34931305987
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 12, 't': 8, 'action': None, 'reward': 2.349313059866067, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.35)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 5), heading: (0, 1), action: left, reward: 1.87146212077
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 11, 't': 9, 'action': 'left', 'reward': 1.871462120767719, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.87)
50% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 256
\-------------------------
Environment.reset(): Trial set up with start = (6, 5), destination = (4, 3), deadline = 20
Simulating trial. . .
epsilon = 0.0781; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 5), heading: (-1, 0), action: forward, reward: 1.18600276656
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 20, 't': 0, 'action': 'forward', 'reward': 1.186002766560809, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.19)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 4), heading: (0, -1), action: right, reward: 0.366590163094
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', 'left'), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 0.36659016309372994, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', 'left')
Agent drove right instead of forward. (rewarded 0.37)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (4, 4), heading: (-1, 0), action: left, reward: 1.75682599615
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 18, 't': 2, 'action': 'left', 'reward': 1.7568259961453996, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.76)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 3), heading: (0, -1), action: right, reward: 1.14532769564
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', 'right'), 'deadline': 17, 't': 3, 'action': 'right', 'reward': 1.1453276956396223, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', 'right')
Agent followed the waypoint right. (rewarded 1.15)
80% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 257
\-------------------------
Environment.reset(): Trial set up with start = (7, 3), destination = (2, 4), deadline = 20
Simulating trial. . .
epsilon = 0.0773; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: right, reward: 1.91413371634
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.9141337163415688, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.91)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: None, reward: 1.94017891035
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 19, 't': 1, 'action': None, 'reward': 1.9401789103524691, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 1.94)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: None, reward: 2.92205525597
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.922055255965457, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.92)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: None, reward: 1.73038829828
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 17, 't': 3, 'action': None, 'reward': 1.730388298279837, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.73)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: None, reward: 1.49871684797
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 16, 't': 4, 'action': None, 'reward': 1.4987168479692605, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.50)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: forward, reward: 2.42153673661
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 2.4215367366079414, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.42)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: None, reward: 2.08875592764
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 14, 't': 6, 'action': None, 'reward': 2.088755927643372, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.09)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: forward, reward: 0.92107520346
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 0.9210752034597569, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 0.92)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 1.78846570874
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'forward'), 'deadline': 12, 't': 8, 'action': None, 'reward': 1.788465708736621, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.79)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (2, 4), heading: (0, 1), action: right, reward: 1.80004604184
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 11, 't': 9, 'action': 'right', 'reward': 1.8000460418424538, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.80)
50% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 258
\-------------------------
Environment.reset(): Trial set up with start = (2, 4), destination = (4, 6), deadline = 20
Simulating trial. . .
epsilon = 0.0765; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 5), heading: (0, 1), action: left, reward: 2.51660328616
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 20, 't': 0, 'action': 'left', 'reward': 2.516603286162237, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 2.52)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 5), heading: (0, 1), action: None, reward: 2.63981251704
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 19, 't': 1, 'action': None, 'reward': 2.6398125170373854, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.64)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 5), heading: (0, 1), action: None, reward: 1.15769954642
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.1576995464216497, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.16)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 5), heading: (0, 1), action: None, reward: 1.4270182564
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 1.4270182564000078, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.43)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 5), heading: (0, 1), action: None, reward: -5.69828186376
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 16, 't': 4, 'action': None, 'reward': -5.698281863759396, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.70)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: left, reward: 2.61969792309
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 15, 't': 5, 'action': 'left', 'reward': 2.619697923087058, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.62)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: forward, reward: 1.4630056833
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 14, 't': 6, 'action': 'forward', 'reward': 1.463005683299189, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.46)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 6), heading: (0, 1), action: right, reward: 1.31615090666
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 13, 't': 7, 'action': 'right', 'reward': 1.3161509066596448, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.32)
60% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 259
\-------------------------
Environment.reset(): Trial set up with start = (3, 3), destination = (8, 2), deadline = 20
Simulating trial. . .
epsilon = 0.0758; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 4), heading: (0, 1), action: right, reward: 1.12322477706
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', 'left'), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.1232247770600563, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', 'left')
Agent drove right instead of left. (rewarded 1.12)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 4), heading: (-1, 0), action: right, reward: 1.37570000559
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 1.375700005591647, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.38)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: forward, reward: 2.82217925925
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 18, 't': 2, 'action': 'forward', 'reward': 2.822179259251285, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 2.82)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 4), heading: (-1, 0), action: forward, reward: 2.58977435832
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'forward'), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 2.5897743583166886, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'forward')
Agent followed the waypoint forward. (rewarded 2.59)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 3), heading: (0, -1), action: right, reward: 2.6995096018
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 16, 't': 4, 'action': 'right', 'reward': 2.699509601802765, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 2.70)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (8, 2), heading: (0, -1), action: forward, reward: 1.39492390286
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 1.3949239028590454, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.39)
70% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 260
\-------------------------
Environment.reset(): Trial set up with start = (5, 5), destination = (1, 5), deadline = 20
Simulating trial. . .
epsilon = 0.0750; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 5), heading: (1, 0), action: None, reward: 2.66747418325
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 20, 't': 0, 'action': None, 'reward': 2.6674741832541145, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.67)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 5), heading: (1, 0), action: None, reward: 1.12644145426
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'right'), 'deadline': 19, 't': 1, 'action': None, 'reward': 1.1264414542571881, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'right')
Agent properly idled at a red light. (rewarded 1.13)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 5), heading: (1, 0), action: None, reward: 2.06459721228
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.064597212277137, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.06)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (6, 5), heading: (1, 0), action: forward, reward: 1.76051714208
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 1.7605171420779118, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.76)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (6, 5), heading: (1, 0), action: None, reward: 1.47321628918
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 16, 't': 4, 'action': None, 'reward': 1.4732162891827643, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.47)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 5), heading: (1, 0), action: forward, reward: 2.01154679212
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 2.0115467921217327, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.01)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (8, 5), heading: (1, 0), action: forward, reward: 1.24892284879
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 14, 't': 6, 'action': 'forward', 'reward': 1.2489228487863095, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.25)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (8, 5), heading: (1, 0), action: None, reward: 2.1813100795
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 13, 't': 7, 'action': None, 'reward': 2.181310079496324, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.18)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (8, 5), heading: (1, 0), action: None, reward: 1.48537836332
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 12, 't': 8, 'action': None, 'reward': 1.4853783633176836, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.49)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 5), heading: (1, 0), action: forward, reward: 2.58579395642
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 11, 't': 9, 'action': 'forward', 'reward': 2.585793956418457, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.59)
50% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 261
\-------------------------
Environment.reset(): Trial set up with start = (4, 2), destination = (8, 3), deadline = 25
Simulating trial. . .
epsilon = 0.0743; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 2), heading: (-1, 0), action: forward, reward: 2.95992068798
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 25, 't': 0, 'action': 'forward', 'reward': 2.959920687977226, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.96)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 2), heading: (-1, 0), action: None, reward: 1.98363697069
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'right'), 'deadline': 24, 't': 1, 'action': None, 'reward': 1.9836369706852914, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'right')
Agent properly idled at a red light. (rewarded 1.98)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 2), heading: (-1, 0), action: None, reward: 0.984732366545
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 23, 't': 2, 'action': None, 'reward': 0.9847323665450725, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 0.98)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 2), heading: (-1, 0), action: None, reward: 2.28548168932
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 22, 't': 3, 'action': None, 'reward': 2.2854816893233667, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.29)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 2), heading: (-1, 0), action: None, reward: 1.38347261815
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'forward'), 'deadline': 21, 't': 4, 'action': None, 'reward': 1.3834726181505281, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 1.38)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: forward, reward: 2.85870871642
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 20, 't': 5, 'action': 'forward', 'reward': 2.8587087164242857, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.86)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: None, reward: 2.35593608818
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 19, 't': 6, 'action': None, 'reward': 2.355936088176356, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.36)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 2), heading: (-1, 0), action: forward, reward: 2.10927043503
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 2.109270435034185, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 2.11)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: forward, reward: 1.7453594804
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 17, 't': 8, 'action': 'forward', 'reward': 1.7453594804017454, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.75)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (7, 2), heading: (-1, 0), action: forward, reward: -0.0127349570498
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', 'forward'), 'deadline': 16, 't': 9, 'action': 'forward', 'reward': -0.012734957049787665, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', 'forward')
Agent drove forward instead of left. (rewarded -0.01)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (7, 2), heading: (-1, 0), action: None, reward: 1.20985427434
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', 'forward'), 'deadline': 15, 't': 10, 'action': None, 'reward': 1.209854274336439, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 1.21)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (7, 7), heading: (0, -1), action: right, reward: 1.25113091455
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', 'right'), 'deadline': 14, 't': 11, 'action': 'right', 'reward': 1.2511309145529816, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', 'right')
Agent drove right instead of left. (rewarded 1.25)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: left, reward: 0.835803267064
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', 'forward'), 'deadline': 13, 't': 12, 'action': 'left', 'reward': 0.8358032670636455, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', 'forward')
Agent drove left instead of right. (rewarded 0.84)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: None, reward: 2.09301091205
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 12, 't': 13, 'action': None, 'reward': 2.093010912049028, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.09)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (6, 6), heading: (0, -1), action: right, reward: 1.24793037115
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 11, 't': 14, 'action': 'right', 'reward': 1.2479303711518304, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent drove right instead of left. (rewarded 1.25)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (7, 6), heading: (1, 0), action: right, reward: 1.83493648052
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 10, 't': 15, 'action': 'right', 'reward': 1.8349364805242987, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent followed the waypoint right. (rewarded 1.83)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (7, 6), heading: (1, 0), action: None, reward: 2.15302636433
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 9, 't': 16, 'action': None, 'reward': 2.1530263643272223, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.15)
32% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (7, 6), heading: (1, 0), action: None, reward: 1.04812100486
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 8, 't': 17, 'action': None, 'reward': 1.0481210048635676, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.05)
28% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (7, 6), heading: (1, 0), action: None, reward: 0.71924793435
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 7, 't': 18, 'action': None, 'reward': 0.7192479343497602, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 0.72)
24% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (7, 6), heading: (1, 0), action: None, reward: 2.33725995994
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 6, 't': 19, 'action': None, 'reward': 2.337259959942742, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 2.34)
20% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (7, 6), heading: (1, 0), action: None, reward: 0.415357540121
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 5, 't': 20, 'action': None, 'reward': 0.41535754012078874, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 0.42)
16% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: forward, reward: 0.565395214575
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 4, 't': 21, 'action': 'forward', 'reward': 0.5653952145754089, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent followed the waypoint forward. (rewarded 0.57)
12% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (8, 7), heading: (0, 1), action: right, reward: 1.30853370094
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 3, 't': 22, 'action': 'right', 'reward': 1.3085337009396096, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 1.31)
8% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (8, 7), heading: (0, 1), action: None, reward: 1.75899528012
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 2, 't': 23, 'action': None, 'reward': 1.7589952801173214, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 1.76)
4% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (8, 7), heading: (0, 1), action: None, reward: 0.237170446027
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 1, 't': 24, 'action': None, 'reward': 0.23717044602712267, 'waypoint': 'forward'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.24)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 262
\-------------------------
Environment.reset(): Trial set up with start = (5, 3), destination = (1, 6), deadline = 35
Simulating trial. . .
epsilon = 0.0735; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: None, reward: 1.76598390246
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 35, 't': 0, 'action': None, 'reward': 1.7659839024644128, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.77)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: None, reward: 2.27656204486
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 34, 't': 1, 'action': None, 'reward': 2.2765620448640123, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 2.28)
94% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: None, reward: 1.22182551937
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 33, 't': 2, 'action': None, 'reward': 1.2218255193685699, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 1.22)
91% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: None, reward: 1.82760525652
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'right'), 'deadline': 32, 't': 3, 'action': None, 'reward': 1.827605256524809, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 1.83)
89% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: None, reward: 0.966884021575
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 31, 't': 4, 'action': None, 'reward': 0.9668840215750325, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.97)
86% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: None, reward: 2.91422301408
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 30, 't': 5, 'action': None, 'reward': 2.9142230140768413, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.91)
83% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (6, 3), heading: (1, 0), action: left, reward: 1.27169447518
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 29, 't': 6, 'action': 'left', 'reward': 1.2716944751767134, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.27)
80% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 3), heading: (1, 0), action: None, reward: 1.93737541088
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'left'), 'deadline': 28, 't': 7, 'action': None, 'reward': 1.9373754108805343, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'left')
Agent properly idled at a red light. (rewarded 1.94)
77% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (6, 3), heading: (1, 0), action: None, reward: 2.58145936619
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'left'), 'deadline': 27, 't': 8, 'action': None, 'reward': 2.5814593661850043, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'left')
Agent properly idled at a red light. (rewarded 2.58)
74% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (7, 3), heading: (1, 0), action: forward, reward: 2.38162696633
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 26, 't': 9, 'action': 'forward', 'reward': 2.3816269663318295, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.38)
71% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (7, 3), heading: (1, 0), action: left, reward: -39.4988287117
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 25, 't': 10, 'action': 'left', 'reward': -39.49882871170898, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -39.50)
69% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: forward, reward: 1.35629811742
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 24, 't': 11, 'action': 'forward', 'reward': 1.3562981174206405, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.36)
66% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: None, reward: 2.67835415654
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 23, 't': 12, 'action': None, 'reward': 2.6783541565436497, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.68)
63% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: forward, reward: 2.19929111188
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 22, 't': 13, 'action': 'forward', 'reward': 2.199291111881004, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.20)
60% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (1, 2), heading: (0, -1), action: left, reward: 1.3987033625
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 21, 't': 14, 'action': 'left', 'reward': 1.3987033625000282, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 1.40)
57% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (1, 2), heading: (0, -1), action: None, reward: 1.84307793874
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 20, 't': 15, 'action': None, 'reward': 1.8430779387414167, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.84)
54% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (1, 2), heading: (0, -1), action: None, reward: 2.72026350788
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 19, 't': 16, 'action': None, 'reward': 2.7202635078821418, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.72)
51% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (1, 7), heading: (0, -1), action: forward, reward: 1.46607021269
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 18, 't': 17, 'action': 'forward', 'reward': 1.4660702126907124, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.47)
49% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (1, 7), heading: (0, -1), action: None, reward: 1.31730122469
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 17, 't': 18, 'action': None, 'reward': 1.3173012246864912, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.32)
46% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 6), heading: (0, -1), action: forward, reward: 1.92896461292
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 16, 't': 19, 'action': 'forward', 'reward': 1.9289646129192581, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.93)
43% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 263
\-------------------------
Environment.reset(): Trial set up with start = (4, 2), destination = (1, 6), deadline = 25
Simulating trial. . .
epsilon = 0.0728; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (4, 2), heading: (-1, 0), action: None, reward: 2.2365233823
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 25, 't': 0, 'action': None, 'reward': 2.2365233823001343, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.24)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (4, 2), heading: (-1, 0), action: None, reward: 1.94710491706
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 24, 't': 1, 'action': None, 'reward': 1.947104917061721, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.95)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (4, 2), heading: (-1, 0), action: None, reward: 2.842677377
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 23, 't': 2, 'action': None, 'reward': 2.8426773769988953, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.84)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 2), heading: (-1, 0), action: forward, reward: 1.06272359907
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': 1.0627235990674164, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.06)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 2), heading: (-1, 0), action: None, reward: 1.23093751886
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 21, 't': 4, 'action': None, 'reward': 1.2309375188639042, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.23)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (3, 3), heading: (0, 1), action: left, reward: 1.46328338951
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 20, 't': 5, 'action': 'left', 'reward': 1.4632833895134376, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent drove left instead of forward. (rewarded 1.46)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 3), heading: (-1, 0), action: right, reward: 2.11306463148
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 19, 't': 6, 'action': 'right', 'reward': 2.113064631477803, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent followed the waypoint right. (rewarded 2.11)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 3), heading: (-1, 0), action: forward, reward: 0.979338690627
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 0.9793386906274175, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 0.98)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (1, 2), heading: (0, -1), action: right, reward: 1.22305257514
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 17, 't': 8, 'action': 'right', 'reward': 1.223052575135456, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent followed the waypoint right. (rewarded 1.22)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (1, 2), heading: (0, -1), action: forward, reward: -40.0400629608
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 16, 't': 9, 'action': 'forward', 'reward': -40.04006296078237, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -40.04)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (1, 2), heading: (0, -1), action: left, reward: -9.43755941844
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 15, 't': 10, 'action': 'left', 'reward': -9.437559418440076, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -9.44)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (1, 7), heading: (0, -1), action: forward, reward: 1.50740827317
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 14, 't': 11, 'action': 'forward', 'reward': 1.507408273169839, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.51)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (1, 7), heading: (0, -1), action: None, reward: 2.27895193835
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 13, 't': 12, 'action': None, 'reward': 2.2789519383518857, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.28)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 6), heading: (0, -1), action: forward, reward: 2.71668885305
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 12, 't': 13, 'action': 'forward', 'reward': 2.7166888530471835, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent followed the waypoint forward. (rewarded 2.72)
44% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 264
\-------------------------
Environment.reset(): Trial set up with start = (1, 2), destination = (6, 7), deadline = 20
Simulating trial. . .
epsilon = 0.0721; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: right, reward: 2.17834229473
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'right', None), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 2.178342294728225, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', None)
Agent followed the waypoint right. (rewarded 2.18)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 3), heading: (0, 1), action: left, reward: 1.15885402576
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'right'), 'deadline': 19, 't': 1, 'action': 'left', 'reward': 1.15885402575731, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'right')
Agent drove left instead of forward. (rewarded 1.16)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 3), heading: (-1, 0), action: right, reward: 1.11263207696
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 18, 't': 2, 'action': 'right', 'reward': 1.112632076958028, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.11)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 3), heading: (-1, 0), action: None, reward: 1.5927410829
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', 'forward'), 'deadline': 17, 't': 3, 'action': None, 'reward': 1.5927410828987438, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', 'forward')
Agent properly idled at a red light. (rewarded 1.59)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 3), heading: (-1, 0), action: None, reward: 1.98334850739
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 16, 't': 4, 'action': None, 'reward': 1.983348507389154, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.98)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (6, 3), heading: (-1, 0), action: forward, reward: 1.63696256781
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 1.6369625678082382, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.64)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (6, 2), heading: (0, -1), action: right, reward: 1.70530809024
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'right'), 'deadline': 14, 't': 6, 'action': 'right', 'reward': 1.70530809023736, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'right')
Agent followed the waypoint right. (rewarded 1.71)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 7), heading: (0, -1), action: forward, reward: 1.9649688072
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 1.9649688071962346, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.96)
60% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 265
\-------------------------
Environment.reset(): Trial set up with start = (7, 4), destination = (5, 2), deadline = 20
Simulating trial. . .
epsilon = 0.0714; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (7, 4), heading: (-1, 0), action: None, reward: 1.38991240446
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'forward'), 'deadline': 20, 't': 0, 'action': None, 'reward': 1.3899124044594329, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 1.39)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (7, 4), heading: (-1, 0), action: None, reward: 1.60265965844
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 19, 't': 1, 'action': None, 'reward': 1.6026596584357309, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.60)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 4), heading: (-1, 0), action: None, reward: 1.29256121719
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'forward'), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.2925612171883445, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 1.29)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 4), heading: (-1, 0), action: None, reward: 2.75739167886
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 2.75739167885657, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.76)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 3), heading: (0, -1), action: right, reward: 1.70628146297
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'left'), 'deadline': 16, 't': 4, 'action': 'right', 'reward': 1.706281462968751, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'left')
Agent drove right instead of forward. (rewarded 1.71)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (6, 3), heading: (-1, 0), action: left, reward: 1.78378392085
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 15, 't': 5, 'action': 'left', 'reward': 1.7837839208507815, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.78)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (5, 3), heading: (-1, 0), action: forward, reward: 1.6207397595
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 14, 't': 6, 'action': 'forward', 'reward': 1.620739759496495, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.62)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 2), heading: (0, -1), action: right, reward: 2.60668992493
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 13, 't': 7, 'action': 'right', 'reward': 2.6066899249292574, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 2.61)
60% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 266
\-------------------------
Environment.reset(): Trial set up with start = (5, 5), destination = (1, 3), deadline = 30
Simulating trial. . .
epsilon = 0.0707; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 5), heading: (0, 1), action: None, reward: 2.79630825178
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', 'left'), 'deadline': 30, 't': 0, 'action': None, 'reward': 2.796308251777136, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'left')
Agent properly idled at a red light. (rewarded 2.80)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 5), heading: (0, 1), action: None, reward: 1.27587232255
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 29, 't': 1, 'action': None, 'reward': 1.275872322552645, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.28)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 5), heading: (0, 1), action: None, reward: 2.79421855969
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 28, 't': 2, 'action': None, 'reward': 2.7942185596881117, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.79)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (4, 5), heading: (-1, 0), action: right, reward: 0.00504213222351
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 27, 't': 3, 'action': 'right', 'reward': 0.005042132223505358, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 0.01)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (4, 5), heading: (-1, 0), action: None, reward: 2.62108802076
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 26, 't': 4, 'action': None, 'reward': 2.6210880207557072, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.62)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (3, 5), heading: (-1, 0), action: forward, reward: 1.49299466555
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 25, 't': 5, 'action': 'forward', 'reward': 1.4929946655486388, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.49)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: forward, reward: 1.70823928933
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 24, 't': 6, 'action': 'forward', 'reward': 1.7082392893255234, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent followed the waypoint forward. (rewarded 1.71)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: forward, reward: 2.44108233693
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 23, 't': 7, 'action': 'forward', 'reward': 2.441082336933845, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.44)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (1, 4), heading: (0, -1), action: right, reward: 1.81235282998
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', 'left'), 'deadline': 22, 't': 8, 'action': 'right', 'reward': 1.8123528299848617, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', 'left')
Agent followed the waypoint right. (rewarded 1.81)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (1, 4), heading: (0, -1), action: None, reward: 2.21948965299
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 21, 't': 9, 'action': None, 'reward': 2.2194896529926353, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.22)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (1, 4), heading: (0, -1), action: None, reward: 1.08543953931
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 20, 't': 10, 'action': None, 'reward': 1.085439539307433, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.09)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 3), heading: (0, -1), action: forward, reward: 1.80194332705
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 19, 't': 11, 'action': 'forward', 'reward': 1.8019433270528669, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'right')
Agent followed the waypoint forward. (rewarded 1.80)
60% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 267
\-------------------------
Environment.reset(): Trial set up with start = (3, 5), destination = (7, 6), deadline = 25
Simulating trial. . .
epsilon = 0.0699; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 5), heading: (0, 1), action: None, reward: 1.1772637781
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'forward'), 'deadline': 25, 't': 0, 'action': None, 'reward': 1.1772637780971626, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.18)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 5), heading: (0, 1), action: None, reward: 1.00644307827
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'forward'), 'deadline': 24, 't': 1, 'action': None, 'reward': 1.0064430782690923, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.01)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: right, reward: 2.33772321867
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 23, 't': 2, 'action': 'right', 'reward': 2.3377232186722647, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.34)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: None, reward: 2.86853282141
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 22, 't': 3, 'action': None, 'reward': 2.8685328214051347, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.87)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: forward, reward: 1.35756293969
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 1.3575629396908795, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.36)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: None, reward: 2.67351423302
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 20, 't': 5, 'action': None, 'reward': 2.6735142330211477, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.67)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: None, reward: 1.10821373485
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 19, 't': 6, 'action': None, 'reward': 1.1082137348480066, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.11)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: forward, reward: 2.59224138532
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 2.5922413853154347, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.59)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (7, 5), heading: (-1, 0), action: forward, reward: 0.938736683489
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 17, 't': 8, 'action': 'forward', 'reward': 0.9387366834891289, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 0.94)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (7, 6), heading: (0, 1), action: left, reward: 2.29788892658
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 16, 't': 9, 'action': 'left', 'reward': 2.297888926575118, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.30)
60% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 268
\-------------------------
Environment.reset(): Trial set up with start = (2, 7), destination = (6, 2), deadline = 25
Simulating trial. . .
epsilon = 0.0693; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 7), heading: (-1, 0), action: forward, reward: 2.89121257275
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 25, 't': 0, 'action': 'forward', 'reward': 2.8912125727494313, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent followed the waypoint forward. (rewarded 2.89)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: forward, reward: 2.26330076607
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'forward'), 'deadline': 24, 't': 1, 'action': 'forward', 'reward': 2.263300766067503, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'forward')
Agent followed the waypoint forward. (rewarded 2.26)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: None, reward: 1.57643621018
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.5764362101825953, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.58)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: None, reward: 1.93540193713
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 22, 't': 3, 'action': None, 'reward': 1.935401937129734, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.94)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: forward, reward: 0.951767247488
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 0.951767247488011, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 0.95)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: None, reward: 2.61722048357
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 20, 't': 5, 'action': None, 'reward': 2.617220483567561, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.62)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: None, reward: 1.01871607129
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 19, 't': 6, 'action': None, 'reward': 1.01871607128756, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.02)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: forward, reward: 2.59961585736
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 2.599615857360445, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.60)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 2), heading: (0, 1), action: left, reward: 1.87888640892
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 17, 't': 8, 'action': 'left', 'reward': 1.8788864089154682, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 1.88)
64% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 269
\-------------------------
Environment.reset(): Trial set up with start = (4, 5), destination = (8, 3), deadline = 30
Simulating trial. . .
epsilon = 0.0686; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (4, 5), heading: (-1, 0), action: None, reward: 2.38222687519
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 30, 't': 0, 'action': None, 'reward': 2.3822268751872, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.38)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (4, 5), heading: (-1, 0), action: None, reward: 1.0460773336
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 29, 't': 1, 'action': None, 'reward': 1.046077333602499, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.05)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (4, 5), heading: (-1, 0), action: None, reward: 1.23377671379
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 28, 't': 2, 'action': None, 'reward': 1.2337767137941693, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.23)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (4, 5), heading: (-1, 0), action: None, reward: 1.27334043463
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 27, 't': 3, 'action': None, 'reward': 1.2733404346278117, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.27)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (4, 5), heading: (-1, 0), action: None, reward: 2.63314705309
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 26, 't': 4, 'action': None, 'reward': 2.6331470530888588, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.63)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (3, 5), heading: (-1, 0), action: forward, reward: 2.53238336238
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 25, 't': 5, 'action': 'forward', 'reward': 2.53238336237998, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.53)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: forward, reward: 1.21680860487
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 24, 't': 6, 'action': 'forward', 'reward': 1.2168086048734577, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.22)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: forward, reward: 1.70905909451
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 23, 't': 7, 'action': 'forward', 'reward': 1.709059094513182, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.71)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: forward, reward: 2.60789276011
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 22, 't': 8, 'action': 'forward', 'reward': 2.6078927601066426, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.61)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (8, 4), heading: (0, -1), action: right, reward: 2.6050438054
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 21, 't': 9, 'action': 'right', 'reward': 2.605043805395608, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.61)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (8, 3), heading: (0, -1), action: forward, reward: 2.6908356329
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 20, 't': 10, 'action': 'forward', 'reward': 2.6908356328955483, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.69)
63% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 270
\-------------------------
Environment.reset(): Trial set up with start = (1, 3), destination = (3, 6), deadline = 25
Simulating trial. . .
epsilon = 0.0679; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: left, reward: 2.42579536794
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 25, 't': 0, 'action': 'left', 'reward': 2.4257953679444944, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.43)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 1.74721910355
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 24, 't': 1, 'action': None, 'reward': 1.7472191035497235, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.75)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 2.67915039645
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 23, 't': 2, 'action': None, 'reward': 2.679150396454528, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.68)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 1.37007197253
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 22, 't': 3, 'action': None, 'reward': 1.3700719725345698, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.37)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 2.4497033104
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 21, 't': 4, 'action': None, 'reward': 2.4497033104003982, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.45)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 1.56651998481
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 20, 't': 5, 'action': None, 'reward': 1.5665199848087314, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.57)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 4), heading: (0, 1), action: right, reward: 1.58336063682
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 19, 't': 6, 'action': 'right', 'reward': 1.583360636818194, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent drove right instead of forward. (rewarded 1.58)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: left, reward: 1.00404812602
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'right'), 'deadline': 18, 't': 7, 'action': 'left', 'reward': 1.0040481260239391, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'right')
Agent followed the waypoint left. (rewarded 1.00)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 5), heading: (0, 1), action: right, reward: 1.67637946106
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 17, 't': 8, 'action': 'right', 'reward': 1.6763794610579004, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent followed the waypoint right. (rewarded 1.68)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (3, 5), heading: (0, 1), action: None, reward: 1.90270350491
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 16, 't': 9, 'action': None, 'reward': 1.9027035049139434, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.90)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 6), heading: (0, 1), action: forward, reward: 1.49101859473
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 15, 't': 10, 'action': 'forward', 'reward': 1.4910185947290162, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.49)
56% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 271
\-------------------------
Environment.reset(): Trial set up with start = (2, 2), destination = (6, 4), deadline = 30
Simulating trial. . .
epsilon = 0.0672; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 2), heading: (-1, 0), action: right, reward: 1.0346360111
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'right', 'forward'), 'deadline': 30, 't': 0, 'action': 'right', 'reward': 1.0346360110953752, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'right', 'forward')
Agent followed the waypoint right. (rewarded 1.03)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 2), heading: (-1, 0), action: None, reward: 1.08728134868
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 29, 't': 1, 'action': None, 'reward': 1.0872813486821173, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 1.09)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 2), heading: (-1, 0), action: None, reward: 2.88459504765
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 28, 't': 2, 'action': None, 'reward': 2.884595047649492, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.88)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: forward, reward: 2.88510777228
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 27, 't': 3, 'action': 'forward', 'reward': 2.885107772282341, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.89)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: None, reward: 1.92414429145
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 26, 't': 4, 'action': None, 'reward': 1.9241442914466471, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.92)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: None, reward: 2.94127071936
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 25, 't': 5, 'action': None, 'reward': 2.941270719356138, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.94)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: None, reward: 1.20572475666
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 24, 't': 6, 'action': None, 'reward': 1.2057247566609772, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.21)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (7, 2), heading: (-1, 0), action: forward, reward: 1.76452336443
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 23, 't': 7, 'action': 'forward', 'reward': 1.7645233644330338, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.76)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (6, 2), heading: (-1, 0), action: forward, reward: 2.37109705217
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 22, 't': 8, 'action': 'forward', 'reward': 2.371097052170817, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent followed the waypoint forward. (rewarded 2.37)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (6, 2), heading: (-1, 0), action: None, reward: 1.13792342949
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 21, 't': 9, 'action': None, 'reward': 1.1379234294891267, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.14)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (6, 2), heading: (-1, 0), action: None, reward: 1.25142312381
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', 'right'), 'deadline': 20, 't': 10, 'action': None, 'reward': 1.251423123808668, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'right')
Agent properly idled at a red light. (rewarded 1.25)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (6, 7), heading: (0, -1), action: right, reward: 0.976401448382
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 19, 't': 11, 'action': 'right', 'reward': 0.9764014483823164, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 0.98)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (7, 7), heading: (1, 0), action: right, reward: 1.46855955297
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'right'), 'deadline': 18, 't': 12, 'action': 'right', 'reward': 1.4685595529749895, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'right')
Agent followed the waypoint right. (rewarded 1.47)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (7, 2), heading: (0, 1), action: right, reward: 2.30330707783
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 17, 't': 13, 'action': 'right', 'reward': 2.303307077829722, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.30)
53% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (8, 2), heading: (1, 0), action: left, reward: 0.913728439224
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', 'forward'), 'deadline': 16, 't': 14, 'action': 'left', 'reward': 0.9137284392239853, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', 'forward')
Agent drove left instead of right. (rewarded 0.91)
50% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (8, 3), heading: (0, 1), action: right, reward: 1.5557119735
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 15, 't': 15, 'action': 'right', 'reward': 1.5557119735031724, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 1.56)
47% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (7, 3), heading: (-1, 0), action: right, reward: 1.83880694173
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 14, 't': 16, 'action': 'right', 'reward': 1.8388069417290143, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 1.84)
43% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (7, 3), heading: (-1, 0), action: None, reward: 2.2939093208
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'right'), 'deadline': 13, 't': 17, 'action': None, 'reward': 2.2939093207959718, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 2.29)
40% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (7, 3), heading: (-1, 0), action: None, reward: 2.27329270301
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 12, 't': 18, 'action': None, 'reward': 2.273292703013918, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 2.27)
37% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (6, 3), heading: (-1, 0), action: forward, reward: 2.38229843238
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 11, 't': 19, 'action': 'forward', 'reward': 2.3822984323795424, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 2.38)
33% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (6, 3), heading: (-1, 0), action: None, reward: 2.34790244557
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 10, 't': 20, 'action': None, 'reward': 2.347902445574843, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.35)
30% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 4), heading: (0, 1), action: left, reward: 0.700046086519
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 9, 't': 21, 'action': 'left', 'reward': 0.7000460865193809, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 0.70)
27% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 272
\-------------------------
Environment.reset(): Trial set up with start = (7, 7), destination = (3, 4), deadline = 35
Simulating trial. . .
epsilon = 0.0665; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: None, reward: 2.55087457337
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 35, 't': 0, 'action': None, 'reward': 2.5508745733656593, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.55)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: forward, reward: -10.9042157833
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 34, 't': 1, 'action': 'forward', 'reward': -10.90421578325714, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -10.90)
94% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: None, reward: 1.02641925018
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 33, 't': 2, 'action': None, 'reward': 1.0264192501778109, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.03)
91% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: None, reward: 1.65697004697
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 32, 't': 3, 'action': None, 'reward': 1.6569700469700486, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.66)
89% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 2), heading: (0, 1), action: left, reward: 1.93463534835
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'forward'), 'deadline': 31, 't': 4, 'action': 'left', 'reward': 1.9346353483511858, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'forward')
Agent followed the waypoint left. (rewarded 1.93)
86% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 2), heading: (1, 0), action: left, reward: 1.18656416716
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 30, 't': 5, 'action': 'left', 'reward': 1.1865641671611367, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.19)
83% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: forward, reward: 2.6204973684
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 29, 't': 6, 'action': 'forward', 'reward': 2.6204973683992936, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.62)
80% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 2), heading: (1, 0), action: forward, reward: 1.45395537161
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 28, 't': 7, 'action': 'forward', 'reward': 1.4539553716088338, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'right')
Agent followed the waypoint forward. (rewarded 1.45)
77% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 2), heading: (1, 0), action: forward, reward: 2.33121814176
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 27, 't': 8, 'action': 'forward', 'reward': 2.331218141757689, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'right')
Agent followed the waypoint forward. (rewarded 2.33)
74% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (3, 3), heading: (0, 1), action: right, reward: 2.78592913406
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 26, 't': 9, 'action': 'right', 'reward': 2.785929134058091, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.79)
71% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (3, 3), heading: (0, 1), action: None, reward: 1.65591648261
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 25, 't': 10, 'action': None, 'reward': 1.6559164826070312, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.66)
69% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 4), heading: (0, 1), action: forward, reward: 1.82857095675
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 24, 't': 11, 'action': 'forward', 'reward': 1.8285709567533608, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.83)
66% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 273
\-------------------------
Environment.reset(): Trial set up with start = (5, 5), destination = (1, 6), deadline = 25
Simulating trial. . .
epsilon = 0.0659; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 5), heading: (1, 0), action: right, reward: 2.68296317179
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'right', 'left'), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 2.6829631717902602, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', 'left')
Agent followed the waypoint right. (rewarded 2.68)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 5), heading: (1, 0), action: None, reward: 2.28728243236
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 24, 't': 1, 'action': None, 'reward': 2.287282432355573, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.29)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 5), heading: (1, 0), action: None, reward: 1.82420770657
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.8242077065743476, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.82)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 5), heading: (1, 0), action: forward, reward: 1.6646670877
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': 1.6646670876995788, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.66)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 5), heading: (1, 0), action: None, reward: 2.15253209758
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 21, 't': 4, 'action': None, 'reward': 2.152532097579698, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 2.15)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 5), heading: (1, 0), action: None, reward: 1.23729976001
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 20, 't': 5, 'action': None, 'reward': 1.2372997600080735, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.24)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (8, 5), heading: (1, 0), action: forward, reward: 1.67216948088
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 19, 't': 6, 'action': 'forward', 'reward': 1.672169480878582, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.67)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (8, 5), heading: (1, 0), action: None, reward: 1.66169330456
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 18, 't': 7, 'action': None, 'reward': 1.6616933045570939, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.66)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (8, 5), heading: (1, 0), action: None, reward: 2.87668348692
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 17, 't': 8, 'action': None, 'reward': 2.876683486920463, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.88)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (1, 5), heading: (1, 0), action: forward, reward: 0.925191340999
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 16, 't': 9, 'action': 'forward', 'reward': 0.9251913409994028, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 0.93)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 6), heading: (0, 1), action: right, reward: 0.990728790215
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 15, 't': 10, 'action': 'right', 'reward': 0.9907287902153947, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 0.99)
56% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 274
\-------------------------
Environment.reset(): Trial set up with start = (8, 5), destination = (1, 2), deadline = 20
Simulating trial. . .
epsilon = 0.0652; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 5), heading: (1, 0), action: forward, reward: 1.54474041002
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 20, 't': 0, 'action': 'forward', 'reward': 1.5447404100221498, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.54)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 6), heading: (0, 1), action: right, reward: 1.01145089947
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 1.0114508994734837, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.01)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: forward, reward: 2.07405013512
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 18, 't': 2, 'action': 'forward', 'reward': 2.0740501351183442, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.07)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 2), heading: (0, 1), action: forward, reward: 2.69935346347
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 2.699353463470928, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.70)
80% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 275
\-------------------------
Environment.reset(): Trial set up with start = (8, 5), destination = (5, 2), deadline = 30
Simulating trial. . .
epsilon = 0.0646; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 6), heading: (0, 1), action: right, reward: 1.92823033918
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 30, 't': 0, 'action': 'right', 'reward': 1.9282303391818554, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.93)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (7, 6), heading: (-1, 0), action: right, reward: 1.99540922905
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 29, 't': 1, 'action': 'right', 'reward': 1.995409229052395, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.00)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 6), heading: (-1, 0), action: None, reward: 1.46719809846
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 28, 't': 2, 'action': None, 'reward': 1.4671980984648525, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.47)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 6), heading: (-1, 0), action: None, reward: 2.54949141156
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 27, 't': 3, 'action': None, 'reward': 2.5494914115551035, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.55)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 6), heading: (-1, 0), action: None, reward: 2.92884602998
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 26, 't': 4, 'action': None, 'reward': 2.928846029977265, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.93)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 6), heading: (-1, 0), action: None, reward: 2.92826809444
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 25, 't': 5, 'action': None, 'reward': 2.928268094441692, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.93)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (7, 7), heading: (0, 1), action: left, reward: 1.60888312576
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 24, 't': 6, 'action': 'left', 'reward': 1.6088831257644425, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent drove left instead of forward. (rewarded 1.61)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: right, reward: 1.57162361624
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 23, 't': 7, 'action': 'right', 'reward': 1.571623616241788, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.57)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: None, reward: 2.65167131677
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 22, 't': 8, 'action': None, 'reward': 2.6516713167724992, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.65)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (5, 7), heading: (-1, 0), action: forward, reward: 2.1724191138
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 21, 't': 9, 'action': 'forward', 'reward': 2.1724191138009075, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.17)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 2), heading: (0, 1), action: left, reward: 1.7757943337
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 20, 't': 10, 'action': 'left', 'reward': 1.775794333701521, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 1.78)
63% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 276
\-------------------------
Environment.reset(): Trial set up with start = (8, 4), destination = (7, 7), deadline = 20
Simulating trial. . .
epsilon = 0.0639; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 4), heading: (-1, 0), action: None, reward: 1.87901580022
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 20, 't': 0, 'action': None, 'reward': 1.8790158002205326, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.88)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 4), heading: (-1, 0), action: None, reward: 1.95119353739
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 19, 't': 1, 'action': None, 'reward': 1.951193537388813, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.95)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 4), heading: (-1, 0), action: None, reward: 2.60286789326
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.6028678932636664, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.60)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 4), heading: (-1, 0), action: None, reward: 2.67116557432
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 2.6711655743212317, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.67)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 4), heading: (-1, 0), action: forward, reward: 2.61607969248
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 2.6160796924795244, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.62)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 3), heading: (0, -1), action: right, reward: 1.54618156757
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 15, 't': 5, 'action': 'right', 'reward': 1.5461815675669457, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 1.55)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (7, 2), heading: (0, -1), action: forward, reward: 1.062624639
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 14, 't': 6, 'action': 'forward', 'reward': 1.0626246389957217, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.06)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (7, 2), heading: (0, -1), action: None, reward: 2.35328349634
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 13, 't': 7, 'action': None, 'reward': 2.35328349634499, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.35)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (8, 2), heading: (1, 0), action: right, reward: 1.63715270439
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'right'), 'deadline': 12, 't': 8, 'action': 'right', 'reward': 1.637152704388328, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'right')
Agent drove right instead of forward. (rewarded 1.64)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (8, 3), heading: (0, 1), action: right, reward: 0.0977030087291
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 11, 't': 9, 'action': 'right', 'reward': 0.09770300872912951, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent drove right instead of left. (rewarded 0.10)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (7, 3), heading: (-1, 0), action: right, reward: 1.47312638324
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 10, 't': 10, 'action': 'right', 'reward': 1.4731263832404884, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent followed the waypoint right. (rewarded 1.47)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (7, 2), heading: (0, -1), action: right, reward: 2.55964338058
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 9, 't': 11, 'action': 'right', 'reward': 2.5596433805836982, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent followed the waypoint right. (rewarded 2.56)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (7, 2), heading: (0, -1), action: None, reward: 1.18459565214
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 8, 't': 12, 'action': None, 'reward': 1.1845956521395928, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.18)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (7, 7), heading: (0, -1), action: forward, reward: 2.55042062289
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 7, 't': 13, 'action': 'forward', 'reward': 2.550420622887185, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.55)
30% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 277
\-------------------------
Environment.reset(): Trial set up with start = (2, 5), destination = (8, 3), deadline = 20
Simulating trial. . .
epsilon = 0.0633; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: left, reward: 1.01553867459
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 20, 't': 0, 'action': 'left', 'reward': 1.015538674592516, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.02)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: None, reward: 2.55898635574
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 19, 't': 1, 'action': None, 'reward': 2.5589863557405828, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.56)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: None, reward: 1.15214869215
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.1521486921525441, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.15)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: forward, reward: 2.33151397291
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'left'), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 2.3315139729098675, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'left')
Agent followed the waypoint forward. (rewarded 2.33)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 4), heading: (0, -1), action: right, reward: 2.39486140799
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 16, 't': 4, 'action': 'right', 'reward': 2.394861407985081, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.39)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 4), heading: (0, -1), action: None, reward: 1.82960485879
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 15, 't': 5, 'action': None, 'reward': 1.8296048587918001, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.83)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (8, 4), heading: (0, -1), action: None, reward: 2.33482012955
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 14, 't': 6, 'action': None, 'reward': 2.3348201295459283, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.33)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (8, 3), heading: (0, -1), action: forward, reward: 1.47410002171
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 1.4741000217056772, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'right')
Agent followed the waypoint forward. (rewarded 1.47)
60% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 278
\-------------------------
Environment.reset(): Trial set up with start = (3, 4), destination = (8, 5), deadline = 20
Simulating trial. . .
epsilon = 0.0627; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 5), heading: (0, 1), action: right, reward: 1.84241757147
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.842417571472078, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.84)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: right, reward: 1.73974498263
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 1.739744982629857, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.74)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: None, reward: 1.16197335026
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'forward'), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.1619733502579765, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 1.16)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: None, reward: 1.69971328101
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 1.6997132810129527, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.70)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: forward, reward: 1.94494041147
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 1.9449404114692255, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.94)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: forward, reward: 2.84539074271
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 2.845390742705827, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.85)
70% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 279
\-------------------------
Environment.reset(): Trial set up with start = (2, 7), destination = (5, 5), deadline = 25
Simulating trial. . .
epsilon = 0.0620; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 7), heading: (-1, 0), action: forward, reward: 1.99841520431
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', 'forward'), 'deadline': 25, 't': 0, 'action': 'forward', 'reward': 1.9984152043122285, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', 'forward')
Agent drove forward instead of right. (rewarded 2.00)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 7), heading: (-1, 0), action: None, reward: 0.989290865318
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 24, 't': 1, 'action': None, 'reward': 0.98929086531759, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 0.99)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 7), heading: (-1, 0), action: None, reward: 1.14247122361
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.1424712236055488, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.14)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: forward, reward: 2.36962423596
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': 2.3696242359595496, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.37)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: forward, reward: 2.04990823052
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 2.0499082305224112, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.05)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: None, reward: 2.92050803275
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 20, 't': 5, 'action': None, 'reward': 2.9205080327530517, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.92)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: None, reward: 1.06668820429
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 19, 't': 6, 'action': None, 'reward': 1.0666882042863288, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.07)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: forward, reward: 2.18340146343
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 2.183401463434037, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.18)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: None, reward: 1.3396304955
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 17, 't': 8, 'action': None, 'reward': 1.33963049549851, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.34)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (5, 7), heading: (-1, 0), action: forward, reward: 2.69801647163
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 16, 't': 9, 'action': 'forward', 'reward': 2.6980164716333714, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.70)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (5, 6), heading: (0, -1), action: right, reward: 2.61039296625
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 15, 't': 10, 'action': 'right', 'reward': 2.6103929662462155, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.61)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (4, 6), heading: (-1, 0), action: left, reward: -0.0125660277946
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 14, 't': 11, 'action': 'left', 'reward': -0.012566027794602364, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent drove left instead of forward. (rewarded -0.01)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (4, 5), heading: (0, -1), action: right, reward: 2.51106781773
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 13, 't': 12, 'action': 'right', 'reward': 2.5110678177291033, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.51)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 5), heading: (1, 0), action: right, reward: 2.38956356581
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 12, 't': 13, 'action': 'right', 'reward': 2.3895635658074275, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 2.39)
44% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 280
\-------------------------
Environment.reset(): Trial set up with start = (8, 5), destination = (5, 2), deadline = 30
Simulating trial. . .
epsilon = 0.0614; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 4), heading: (0, -1), action: right, reward: 0.576030664229
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', 'forward'), 'deadline': 30, 't': 0, 'action': 'right', 'reward': 0.5760306642293782, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', 'forward')
Agent drove right instead of forward. (rewarded 0.58)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 4), heading: (0, -1), action: None, reward: 2.57844408216
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 29, 't': 1, 'action': None, 'reward': 2.5784440821598285, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.58)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 4), heading: (0, -1), action: None, reward: 1.62703386599
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 28, 't': 2, 'action': None, 'reward': 1.627033865988921, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.63)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 4), heading: (0, -1), action: None, reward: 1.06609458763
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 27, 't': 3, 'action': None, 'reward': 1.0660945876269654, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 1.07)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 4), heading: (1, 0), action: right, reward: 0.370542307613
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 26, 't': 4, 'action': 'right', 'reward': 0.37054230761278617, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 0.37)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 3), heading: (0, -1), action: left, reward: 2.51846276233
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 25, 't': 5, 'action': 'left', 'reward': 2.518462762332118, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 2.52)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (8, 3), heading: (-1, 0), action: left, reward: 1.98530155916
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 24, 't': 6, 'action': 'left', 'reward': 1.9853015591595713, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.99)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (7, 3), heading: (-1, 0), action: forward, reward: 1.65122858998
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 23, 't': 7, 'action': 'forward', 'reward': 1.651228589977913, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.65)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (7, 3), heading: (-1, 0), action: None, reward: 2.59737141743
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 22, 't': 8, 'action': None, 'reward': 2.5973714174340397, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.60)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (6, 3), heading: (-1, 0), action: forward, reward: 1.21714710841
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 21, 't': 9, 'action': 'forward', 'reward': 1.2171471084099668, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.22)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (6, 3), heading: (-1, 0), action: None, reward: 1.67958565839
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 20, 't': 10, 'action': None, 'reward': 1.679585658394899, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.68)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (5, 3), heading: (-1, 0), action: forward, reward: 2.7173830621
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 19, 't': 11, 'action': 'forward', 'reward': 2.7173830621045774, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.72)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (5, 3), heading: (-1, 0), action: None, reward: 1.13949370636
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'forward'), 'deadline': 18, 't': 12, 'action': None, 'reward': 1.1394937063554873, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.14)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 2), heading: (0, -1), action: right, reward: 1.76476451908
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 17, 't': 13, 'action': 'right', 'reward': 1.7647645190763888, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.76)
53% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 281
\-------------------------
Environment.reset(): Trial set up with start = (6, 4), destination = (8, 7), deadline = 25
Simulating trial. . .
epsilon = 0.0608; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (7, 4), heading: (1, 0), action: forward, reward: 2.76415799907
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 25, 't': 0, 'action': 'forward', 'reward': 2.764157999071461, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.76)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (7, 4), heading: (1, 0), action: None, reward: 1.70469932648
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 24, 't': 1, 'action': None, 'reward': 1.7046993264817005, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.70)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 4), heading: (1, 0), action: left, reward: -10.7073053453
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, 'right'), 'deadline': 23, 't': 2, 'action': 'left', 'reward': -10.707305345288624, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'right')
Agent attempted driving left through a red light. (rewarded -10.71)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 4), heading: (1, 0), action: None, reward: 1.4557415566
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 22, 't': 3, 'action': None, 'reward': 1.4557415565974015, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.46)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 4), heading: (1, 0), action: forward, reward: 0.964886346148
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 0.9648863461481527, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 0.96)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 4), heading: (1, 0), action: None, reward: 2.21949909917
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', 'left'), 'deadline': 20, 't': 5, 'action': None, 'reward': 2.21949909916803, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', 'left')
Agent properly idled at a red light. (rewarded 2.22)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (8, 4), heading: (1, 0), action: None, reward: 2.46417675233
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 19, 't': 6, 'action': None, 'reward': 2.464176752330909, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.46)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (8, 3), heading: (0, -1), action: left, reward: 2.86098293231
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 18, 't': 7, 'action': 'left', 'reward': 2.860982932314149, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.86)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (8, 2), heading: (0, -1), action: forward, reward: 0.953165926972
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 17, 't': 8, 'action': 'forward', 'reward': 0.9531659269721733, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 0.95)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (8, 2), heading: (0, -1), action: None, reward: 2.56388863098
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', 'forward'), 'deadline': 16, 't': 9, 'action': None, 'reward': 2.563888630976657, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', 'forward')
Agent properly idled at a red light. (rewarded 2.56)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (8, 7), heading: (0, -1), action: forward, reward: 0.853696757439
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 15, 't': 10, 'action': 'forward', 'reward': 0.853696757438934, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 0.85)
56% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 282
\-------------------------
Environment.reset(): Trial set up with start = (7, 7), destination = (4, 2), deadline = 20
Simulating trial. . .
epsilon = 0.0602; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (7, 2), heading: (0, 1), action: right, reward: 2.30457137378
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 2.3045713737828972, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 2.30)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 2), heading: (-1, 0), action: right, reward: 1.51964518162
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 1.5196451816193, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent followed the waypoint right. (rewarded 1.52)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 2), heading: (-1, 0), action: None, reward: 1.52168912139
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.5216891213887707, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.52)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 2), heading: (-1, 0), action: forward, reward: 1.73462124594
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 1.734621245936882, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.73)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 2), heading: (-1, 0), action: forward, reward: 2.91245538848
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 2.9124553884761584, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.91)
75% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 283
\-------------------------
Environment.reset(): Trial set up with start = (8, 2), destination = (2, 4), deadline = 20
Simulating trial. . .
epsilon = 0.0596; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 7), heading: (0, -1), action: right, reward: 1.48931221051
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.4893122105089334, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 1.49)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: right, reward: 1.50198177468
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 1.5019817746844797, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent followed the waypoint right. (rewarded 1.50)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 7), heading: (1, 0), action: forward, reward: 1.07372729755
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 18, 't': 2, 'action': 'forward', 'reward': 1.0737272975520211, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.07)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 2), heading: (0, 1), action: right, reward: 1.84671020146
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 17, 't': 3, 'action': 'right', 'reward': 1.846710201460085, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 1.85)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 2), heading: (0, 1), action: None, reward: 2.18187045841
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 16, 't': 4, 'action': None, 'reward': 2.1818704584115247, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.18)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 3), heading: (0, 1), action: forward, reward: 2.45353348972
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 2.4535334897197467, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'right')
Agent followed the waypoint forward. (rewarded 2.45)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 3), heading: (0, 1), action: None, reward: 2.42893495905
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 14, 't': 6, 'action': None, 'reward': 2.4289349590532803, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.43)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 3), heading: (0, 1), action: None, reward: 1.48565964673
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 13, 't': 7, 'action': None, 'reward': 1.4856596467307395, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.49)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 3), heading: (0, 1), action: None, reward: 1.04156488256
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 12, 't': 8, 'action': None, 'reward': 1.0415648825574468, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.04)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 3), heading: (0, 1), action: None, reward: 1.80178597081
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 11, 't': 9, 'action': None, 'reward': 1.8017859708070356, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.80)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (2, 3), heading: (0, 1), action: None, reward: 0.835109735031
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 10, 't': 10, 'action': None, 'reward': 0.8351097350307406, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.84)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (2, 4), heading: (0, 1), action: forward, reward: 1.40637439815
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 9, 't': 11, 'action': 'forward', 'reward': 1.4063743981473207, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent followed the waypoint forward. (rewarded 1.41)
40% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 284
\-------------------------
Environment.reset(): Trial set up with start = (1, 5), destination = (4, 2), deadline = 30
Simulating trial. . .
epsilon = 0.0590; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: forward, reward: 0.453581564802
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', None), 'deadline': 30, 't': 0, 'action': 'forward', 'reward': 0.4535815648017618, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', None)
Agent drove forward instead of left. (rewarded 0.45)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 6), heading: (0, 1), action: left, reward: 1.42433424954
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 29, 't': 1, 'action': 'left', 'reward': 1.4243342495437403, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 1.42)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 6), heading: (0, 1), action: forward, reward: -10.1372363357
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 28, 't': 2, 'action': 'forward', 'reward': -10.13723633569576, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent attempted driving forward through a red light. (rewarded -10.14)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 6), heading: (0, 1), action: None, reward: 1.52756172894
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 27, 't': 3, 'action': None, 'reward': 1.5275617289445949, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.53)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: left, reward: 1.65063620755
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 26, 't': 4, 'action': 'left', 'reward': 1.6506362075476122, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.65)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: forward, reward: 1.04043654521
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 25, 't': 5, 'action': 'forward', 'reward': 1.0404365452051934, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.04)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: None, reward: 1.14084697583
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', 'forward'), 'deadline': 24, 't': 6, 'action': None, 'reward': 1.1408469758326718, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', 'forward')
Agent properly idled at a red light. (rewarded 1.14)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: None, reward: 1.27778351242
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 23, 't': 7, 'action': None, 'reward': 1.2777835124204329, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.28)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: None, reward: 2.08419084512
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 22, 't': 8, 'action': None, 'reward': 2.0841908451165265, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.08)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: forward, reward: 2.08875066061
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 21, 't': 9, 'action': 'forward', 'reward': 2.0887506606134334, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 2.09)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: None, reward: 2.42263943497
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 20, 't': 10, 'action': None, 'reward': 2.4226394349672713, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.42)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (4, 6), heading: (1, 0), action: forward, reward: 1.90677270392
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 19, 't': 11, 'action': 'forward', 'reward': 1.9067727039153075, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.91)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: forward, reward: 0.00939070860942
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 18, 't': 12, 'action': 'forward', 'reward': 0.009390708609423704, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent drove forward instead of right. (rewarded 0.01)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: None, reward: 0.600546860667
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', 'forward'), 'deadline': 17, 't': 13, 'action': None, 'reward': 0.6005468606672704, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 0.60)
53% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: None, reward: 0.576881872832
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', 'forward'), 'deadline': 16, 't': 14, 'action': None, 'reward': 0.5768818728320944, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 0.58)
50% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (5, 7), heading: (0, 1), action: right, reward: 1.00572797807
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 15, 't': 15, 'action': 'right', 'reward': 1.0057279780710964, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 1.01)
47% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (4, 7), heading: (-1, 0), action: right, reward: 1.515135085
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 14, 't': 16, 'action': 'right', 'reward': 1.515135084995562, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.52)
43% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (4, 7), heading: (-1, 0), action: None, reward: 0.94955650222
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 13, 't': 17, 'action': None, 'reward': 0.9495565022202024, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 0.95)
40% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (4, 7), heading: (-1, 0), action: None, reward: 1.98104025125
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 12, 't': 18, 'action': None, 'reward': 1.9810402512480245, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.98)
37% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (4, 6), heading: (0, -1), action: right, reward: 1.17186294201
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 11, 't': 19, 'action': 'right', 'reward': 1.1718629420115714, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 1.17)
33% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (4, 5), heading: (0, -1), action: forward, reward: 0.990978029181
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'right'), 'deadline': 10, 't': 20, 'action': 'forward', 'reward': 0.9909780291811472, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'right')
Agent drove forward instead of right. (rewarded 0.99)
30% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (5, 5), heading: (1, 0), action: right, reward: 2.52681544368
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 9, 't': 21, 'action': 'right', 'reward': 2.5268154436760586, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent followed the waypoint right. (rewarded 2.53)
27% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (5, 6), heading: (0, 1), action: right, reward: 1.96864861549
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 8, 't': 22, 'action': 'right', 'reward': 1.9686486154893195, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.97)
23% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (4, 6), heading: (-1, 0), action: right, reward: 0.912293563569
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 7, 't': 23, 'action': 'right', 'reward': 0.9122935635693943, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 0.91)
20% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (4, 7), heading: (0, 1), action: left, reward: 1.44613970694
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 6, 't': 24, 'action': 'left', 'reward': 1.4461397069427322, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.45)
17% of time remaining to reach destination.
/-------------------
| Step 25 Results
\-------------------
Environment.step(): t = 25
Environment.act() [POST]: location: (4, 7), heading: (0, 1), action: left, reward: -40.3752248265
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 5, 't': 25, 'action': 'left', 'reward': -40.37522482647651, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -40.38)
13% of time remaining to reach destination.
/-------------------
| Step 26 Results
\-------------------
Environment.step(): t = 26
Environment.act() [POST]: location: (4, 7), heading: (0, 1), action: None, reward: 1.4551067485
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 4, 't': 26, 'action': None, 'reward': 1.4551067485007536, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.46)
10% of time remaining to reach destination.
/-------------------
| Step 27 Results
\-------------------
Environment.step(): t = 27
Environment.act() [POST]: location: (4, 7), heading: (0, 1), action: None, reward: 0.70034876292
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 3, 't': 27, 'action': None, 'reward': 0.7003487629204923, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 0.70)
7% of time remaining to reach destination.
/-------------------
| Step 28 Results
\-------------------
Environment.step(): t = 28
Environment.act() [POST]: location: (3, 7), heading: (-1, 0), action: right, reward: -0.338968889017
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 2, 't': 28, 'action': 'right', 'reward': -0.3389688890169654, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent drove right instead of forward. (rewarded -0.34)
3% of time remaining to reach destination.
/-------------------
| Step 29 Results
\-------------------
Environment.step(): t = 29
Environment.act() [POST]: location: (3, 7), heading: (-1, 0), action: None, reward: 0.60359720902
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 1, 't': 29, 'action': None, 'reward': 0.6035972090197836, 'waypoint': 'left'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 0.60)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 285
\-------------------------
Environment.reset(): Trial set up with start = (2, 6), destination = (4, 2), deadline = 20
Simulating trial. . .
epsilon = 0.0584; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: None, reward: 1.456132809
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 20, 't': 0, 'action': None, 'reward': 1.4561328089978758, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.46)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: None, reward: 1.92164862017
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 19, 't': 1, 'action': None, 'reward': 1.9216486201747238, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.92)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: None, reward: 1.42338115313
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.4233811531315774, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.42)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: None, reward: 2.34920030368
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 2.3492003036832823, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.35)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: None, reward: 1.11840277202
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 16, 't': 4, 'action': None, 'reward': 1.11840277202137, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 1.12)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: forward, reward: 1.27277547923
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'left'), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 1.2727754792295356, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'left')
Agent followed the waypoint forward. (rewarded 1.27)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: None, reward: 0.897340654626
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 14, 't': 6, 'action': None, 'reward': 0.897340654626325, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.90)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (4, 6), heading: (1, 0), action: forward, reward: 1.90609840928
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 1.9060984092820257, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.91)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (4, 7), heading: (0, 1), action: right, reward: 2.65653496296
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'right'), 'deadline': 12, 't': 8, 'action': 'right', 'reward': 2.6565349629566386, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'right')
Agent followed the waypoint right. (rewarded 2.66)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 2), heading: (0, 1), action: forward, reward: 2.7457464512
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 11, 't': 9, 'action': 'forward', 'reward': 2.7457464511981984, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.75)
50% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 286
\-------------------------
Environment.reset(): Trial set up with start = (5, 5), destination = (8, 2), deadline = 30
Simulating trial. . .
epsilon = 0.0578; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 5), heading: (0, 1), action: None, reward: 2.95380939019
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 30, 't': 0, 'action': None, 'reward': 2.9538093901941727, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.95)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 5), heading: (0, 1), action: None, reward: 1.24290957145
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 29, 't': 1, 'action': None, 'reward': 1.242909571446733, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.24)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 5), heading: (0, 1), action: None, reward: 1.76109946103
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 28, 't': 2, 'action': None, 'reward': 1.7610994610344028, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.76)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (6, 5), heading: (1, 0), action: left, reward: 1.12709011924
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 27, 't': 3, 'action': 'left', 'reward': 1.1270901192376712, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.13)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 5), heading: (1, 0), action: forward, reward: 0.994496129431
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 26, 't': 4, 'action': 'forward', 'reward': 0.9944961294307959, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent followed the waypoint forward. (rewarded 0.99)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 5), heading: (1, 0), action: None, reward: 1.66471371149
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', 'forward'), 'deadline': 25, 't': 5, 'action': None, 'reward': 1.6647137114948463, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', 'forward')
Agent properly idled at a red light. (rewarded 1.66)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (7, 5), heading: (1, 0), action: None, reward: -5.81430951305
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'forward', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 24, 't': 6, 'action': None, 'reward': -5.814309513051519, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent idled at a green light with no oncoming traffic. (rewarded -5.81)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (8, 5), heading: (1, 0), action: forward, reward: 1.87690840676
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 23, 't': 7, 'action': 'forward', 'reward': 1.8769084067601398, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent followed the waypoint forward. (rewarded 1.88)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (8, 6), heading: (0, 1), action: right, reward: 2.75581365498
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 22, 't': 8, 'action': 'right', 'reward': 2.7558136549769, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent followed the waypoint right. (rewarded 2.76)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (8, 6), heading: (0, 1), action: None, reward: 0.924031011251
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'right'), 'deadline': 21, 't': 9, 'action': None, 'reward': 0.9240310112511074, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 0.92)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (8, 7), heading: (0, 1), action: forward, reward: 2.64801676737
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 20, 't': 10, 'action': 'forward', 'reward': 2.6480167673677553, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.65)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: left, reward: 1.2847246197
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 19, 't': 11, 'action': 'left', 'reward': 1.2847246196995847, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove left instead of forward. (rewarded 1.28)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (1, 2), heading: (0, 1), action: right, reward: 2.61632346019
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 18, 't': 12, 'action': 'right', 'reward': 2.6163234601928567, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent followed the waypoint right. (rewarded 2.62)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: right, reward: 0.92812026768
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 17, 't': 13, 'action': 'right', 'reward': 0.9281202676797371, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 0.93)
53% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 287
\-------------------------
Environment.reset(): Trial set up with start = (6, 2), destination = (8, 6), deadline = 20
Simulating trial. . .
epsilon = 0.0573; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 2), heading: (0, 1), action: None, reward: 2.98892058143
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'forward'), 'deadline': 20, 't': 0, 'action': None, 'reward': 2.988920581427518, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 2.99)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 2), heading: (0, 1), action: None, reward: 1.24929539381
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 19, 't': 1, 'action': None, 'reward': 1.249295393812506, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.25)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 2), heading: (0, 1), action: None, reward: 2.8476246105
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.847624610500533, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.85)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 2), heading: (1, 0), action: left, reward: 2.25972164195
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 17, 't': 3, 'action': 'left', 'reward': 2.2597216419540596, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.26)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 2), heading: (1, 0), action: None, reward: 2.32490236353
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 16, 't': 4, 'action': None, 'reward': 2.3249023635314034, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.32)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 2), heading: (1, 0), action: forward, reward: 1.14761423623
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 1.147614236226223, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.15)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (8, 7), heading: (0, -1), action: left, reward: 2.04016744389
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 14, 't': 6, 'action': 'left', 'reward': 2.0401674438888078, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.04)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (8, 7), heading: (0, -1), action: None, reward: 0.991948157655
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 13, 't': 7, 'action': None, 'reward': 0.9919481576551936, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 0.99)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (8, 7), heading: (0, -1), action: None, reward: 2.80419215683
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 12, 't': 8, 'action': None, 'reward': 2.804192156831247, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.80)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (8, 7), heading: (0, -1), action: None, reward: 1.51086397787
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 11, 't': 9, 'action': None, 'reward': 1.5108639778680906, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.51)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (8, 6), heading: (0, -1), action: forward, reward: 1.99974376354
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 10, 't': 10, 'action': 'forward', 'reward': 1.999743763543519, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent followed the waypoint forward. (rewarded 2.00)
45% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 288
\-------------------------
Environment.reset(): Trial set up with start = (8, 7), destination = (5, 6), deadline = 20
Simulating trial. . .
epsilon = 0.0567; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: right, reward: 1.2095146872
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.2095146872039884, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent followed the waypoint right. (rewarded 1.21)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: forward, reward: 2.50969112981
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 19, 't': 1, 'action': 'forward', 'reward': 2.5096911298075253, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.51)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: None, reward: 1.21088343764
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.21088343763719, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.21)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 7), heading: (-1, 0), action: forward, reward: 2.08235320604
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 2.082353206044763, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.08)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 6), heading: (0, -1), action: right, reward: 2.77360605534
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 16, 't': 4, 'action': 'right', 'reward': 2.7736060553362543, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.77)
75% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 289
\-------------------------
Environment.reset(): Trial set up with start = (2, 4), destination = (7, 3), deadline = 20
Simulating trial. . .
epsilon = 0.0561; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: right, reward: 2.07044778927
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', 'left'), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 2.0704477892708995, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', 'left')
Agent followed the waypoint right. (rewarded 2.07)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: None, reward: 1.9412584542
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 19, 't': 1, 'action': None, 'reward': 1.9412584541950393, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.94)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: None, reward: 1.10268070378
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.1026807037845592, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.10)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 4), heading: (-1, 0), action: forward, reward: 1.50801786728
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 1.5080178672797715, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.51)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 4), heading: (-1, 0), action: forward, reward: 1.82907637407
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 1.8290763740708047, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.83)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (7, 3), heading: (0, -1), action: right, reward: 1.35996978793
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 15, 't': 5, 'action': 'right', 'reward': 1.3599697879321333, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 1.36)
70% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 290
\-------------------------
Environment.reset(): Trial set up with start = (6, 4), destination = (1, 3), deadline = 20
Simulating trial. . .
epsilon = 0.0556; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 4), heading: (1, 0), action: None, reward: 1.53560372663
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', 'forward'), 'deadline': 20, 't': 0, 'action': None, 'reward': 1.5356037266344338, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', 'forward')
Agent properly idled at a red light. (rewarded 1.54)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 4), heading: (1, 0), action: None, reward: 2.42834329183
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 19, 't': 1, 'action': None, 'reward': 2.428343291830342, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.43)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 4), heading: (1, 0), action: None, reward: 1.8455083524
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.8455083524040758, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.85)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 4), heading: (1, 0), action: forward, reward: 2.68149818727
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 2.6814981872689003, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.68)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 4), heading: (1, 0), action: forward, reward: 1.6992650588
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'forward'), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 1.6992650588039306, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'forward')
Agent followed the waypoint forward. (rewarded 1.70)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 4), heading: (1, 0), action: None, reward: 2.6383316659
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 15, 't': 5, 'action': None, 'reward': 2.6383316659001097, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.64)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (8, 4), heading: (1, 0), action: None, reward: 1.18366460828
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 14, 't': 6, 'action': None, 'reward': 1.1836646082808002, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.18)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 4), heading: (1, 0), action: forward, reward: 1.47983093822
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 1.4798309382229202, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.48)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 3), heading: (0, -1), action: left, reward: 2.53019213043
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 12, 't': 8, 'action': 'left', 'reward': 2.5301921304259922, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.53)
55% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 291
\-------------------------
Environment.reset(): Trial set up with start = (4, 6), destination = (8, 2), deadline = 30
Simulating trial. . .
epsilon = 0.0550; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (4, 6), heading: (0, -1), action: None, reward: 2.44153690446
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 30, 't': 0, 'action': None, 'reward': 2.441536904464927, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.44)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (4, 6), heading: (0, -1), action: None, reward: 2.69410888246
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 29, 't': 1, 'action': None, 'reward': 2.694108882459055, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.69)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (4, 6), heading: (0, -1), action: None, reward: 0.988085926811
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 28, 't': 2, 'action': None, 'reward': 0.9880859268112163, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.99)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (4, 6), heading: (0, -1), action: None, reward: 2.47128899832
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 27, 't': 3, 'action': None, 'reward': 2.4712889983189252, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.47)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 6), heading: (-1, 0), action: left, reward: 1.87330857188
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 26, 't': 4, 'action': 'left', 'reward': 1.8733085718797309, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 1.87)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: forward, reward: 2.11384246249
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 25, 't': 5, 'action': 'forward', 'reward': 2.113842462494732, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.11)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: None, reward: 2.8928578826
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 24, 't': 6, 'action': None, 'reward': 2.892857882597905, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.89)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 5), heading: (0, -1), action: right, reward: 0.804947929033
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 23, 't': 7, 'action': 'right', 'reward': 0.8049479290327152, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent drove right instead of forward. (rewarded 0.80)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 5), heading: (0, -1), action: None, reward: 1.96116544626
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 22, 't': 8, 'action': None, 'reward': 1.9611654462583632, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.96)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 5), heading: (0, -1), action: None, reward: 2.64833753925
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 21, 't': 9, 'action': None, 'reward': 2.6483375392545785, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.65)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: left, reward: 1.06642087247
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 20, 't': 10, 'action': 'left', 'reward': 1.066420872473698, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.07)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: forward, reward: 1.80862301047
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 19, 't': 11, 'action': 'forward', 'reward': 1.8086230104678096, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.81)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (8, 6), heading: (0, 1), action: left, reward: 2.450979101
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 18, 't': 12, 'action': 'left', 'reward': 2.4509791010005992, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 2.45)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (8, 7), heading: (0, 1), action: forward, reward: 2.58818471631
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 17, 't': 13, 'action': 'forward', 'reward': 2.5881847163057254, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.59)
53% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (8, 2), heading: (0, 1), action: forward, reward: 2.53366278772
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 16, 't': 14, 'action': 'forward', 'reward': 2.5336627877242766, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'right')
Agent followed the waypoint forward. (rewarded 2.53)
50% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 292
\-------------------------
Environment.reset(): Trial set up with start = (7, 7), destination = (1, 5), deadline = 20
Simulating trial. . .
epsilon = 0.0545; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (7, 7), heading: (0, 1), action: None, reward: 1.7274679182
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'right'), 'deadline': 20, 't': 0, 'action': None, 'reward': 1.7274679181975239, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 1.73)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (7, 7), heading: (0, 1), action: None, reward: 2.9154936439
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 19, 't': 1, 'action': None, 'reward': 2.915493643903896, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 2.92)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 7), heading: (0, 1), action: None, reward: 2.89164522083
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.891645220832494, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.89)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 7), heading: (0, 1), action: None, reward: 1.99431002056
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 17, 't': 3, 'action': None, 'reward': 1.9943100205644215, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.99)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: left, reward: 2.12505659296
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 16, 't': 4, 'action': 'left', 'reward': 2.1250565929623177, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.13)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: forward, reward: 2.2992164658
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'left'), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 2.29921646579618, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'left')
Agent followed the waypoint forward. (rewarded 2.30)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: None, reward: 2.24235366169
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 14, 't': 6, 'action': None, 'reward': 2.242353661689715, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.24)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 6), heading: (0, -1), action: left, reward: 1.27394323254
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 13, 't': 7, 'action': 'left', 'reward': 1.2739432325393878, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.27)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (1, 6), heading: (0, -1), action: None, reward: 1.67698031692
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 12, 't': 8, 'action': None, 'reward': 1.6769803169238375, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.68)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (1, 6), heading: (0, -1), action: None, reward: 1.46059410908
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'forward'), 'deadline': 11, 't': 9, 'action': None, 'reward': 1.4605941090765757, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 1.46)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 5), heading: (0, -1), action: forward, reward: 0.908133957374
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'right'), 'deadline': 10, 't': 10, 'action': 'forward', 'reward': 0.9081339573737577, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'right')
Agent followed the waypoint forward. (rewarded 0.91)
45% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 293
\-------------------------
Environment.reset(): Trial set up with start = (8, 7), destination = (6, 3), deadline = 20
Simulating trial. . .
epsilon = 0.0539; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 7), heading: (0, -1), action: forward, reward: -39.7309569761
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', 'left', 'forward'), 'deadline': 20, 't': 0, 'action': 'forward', 'reward': -39.73095697607416, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'forward')
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -39.73)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 7), heading: (0, -1), action: None, reward: 1.62717247338
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 19, 't': 1, 'action': None, 'reward': 1.627172473376165, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.63)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 7), heading: (0, -1), action: None, reward: 1.37114284191
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.3711428419149476, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.37)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 7), heading: (0, -1), action: None, reward: 2.5511183949
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 2.5511183949002465, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.55)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: left, reward: 2.106420155
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 16, 't': 4, 'action': 'left', 'reward': 2.1064201549954795, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.11)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: forward, reward: 1.57823744156
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 1.5782374415631746, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.58)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: forward, reward: -10.8553249559
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 14, 't': 6, 'action': 'forward', 'reward': -10.855324955905884, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -10.86)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 2), heading: (0, 1), action: left, reward: 2.08736311912
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 13, 't': 7, 'action': 'left', 'reward': 2.087363119118856, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.09)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (6, 2), heading: (0, 1), action: None, reward: 2.38962056152
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 12, 't': 8, 'action': None, 'reward': 2.3896205615221837, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.39)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 3), heading: (0, 1), action: forward, reward: 2.18802744352
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 11, 't': 9, 'action': 'forward', 'reward': 2.188027443521732, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.19)
50% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 294
\-------------------------
Environment.reset(): Trial set up with start = (5, 3), destination = (7, 6), deadline = 25
Simulating trial. . .
epsilon = 0.0534; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: None, reward: 1.85553909699
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 25, 't': 0, 'action': None, 'reward': 1.8555390969920118, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.86)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: None, reward: 2.29409268146
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 24, 't': 1, 'action': None, 'reward': 2.2940926814585163, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.29)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: None, reward: 1.37349905051
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.3734990505061062, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.37)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: None, reward: 0.999634425223
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 22, 't': 3, 'action': None, 'reward': 0.9996344252229237, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.00)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: None, reward: 1.20553178733
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 21, 't': 4, 'action': None, 'reward': 1.2055317873293918, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.21)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: None, reward: 1.55019825437
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 20, 't': 5, 'action': None, 'reward': 1.5501982543727155, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.55)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 3), heading: (-1, 0), action: right, reward: 0.61865570069
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 19, 't': 6, 'action': 'right', 'reward': 0.6186557006899306, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 0.62)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 3), heading: (-1, 0), action: forward, reward: 1.82373022582
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', 'forward'), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 1.8237302258178771, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', 'forward')
Agent drove forward instead of right. (rewarded 1.82)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 3), heading: (-1, 0), action: None, reward: 2.75790117634
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 17, 't': 8, 'action': None, 'reward': 2.7579011763428722, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.76)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 3), heading: (-1, 0), action: forward, reward: 1.59455516878
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 16, 't': 9, 'action': 'forward', 'reward': 1.59455516878, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.59)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (2, 3), heading: (-1, 0), action: None, reward: 2.0002669191
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 15, 't': 10, 'action': None, 'reward': 2.0002669191023545, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.00)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (1, 3), heading: (-1, 0), action: forward, reward: 1.53261690681
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 14, 't': 11, 'action': 'forward', 'reward': 1.532616906814656, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'right')
Agent followed the waypoint forward. (rewarded 1.53)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (1, 3), heading: (-1, 0), action: None, reward: 0.916557199918
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 13, 't': 12, 'action': None, 'reward': 0.9165571999178816, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.92)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (8, 3), heading: (-1, 0), action: forward, reward: 0.865606753376
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 12, 't': 13, 'action': 'forward', 'reward': 0.8656067533762581, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 0.87)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (8, 3), heading: (-1, 0), action: None, reward: 0.723529159959
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'right'), 'deadline': 11, 't': 14, 'action': None, 'reward': 0.7235291599586289, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'right')
Agent properly idled at a red light. (rewarded 0.72)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (8, 3), heading: (-1, 0), action: None, reward: 1.65610483191
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 10, 't': 15, 'action': None, 'reward': 1.6561048319127607, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.66)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (8, 3), heading: (-1, 0), action: None, reward: 2.41371488765
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 9, 't': 16, 'action': None, 'reward': 2.4137148876500873, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.41)
32% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (7, 3), heading: (-1, 0), action: forward, reward: 2.39724626206
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 8, 't': 17, 'action': 'forward', 'reward': 2.3972462620635886, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.40)
28% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (7, 2), heading: (0, -1), action: right, reward: 1.44104090144
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'right', None), 'deadline': 7, 't': 18, 'action': 'right', 'reward': 1.4410409014403636, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', None)
Agent followed the waypoint right. (rewarded 1.44)
24% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (7, 2), heading: (0, -1), action: None, reward: 1.73267835363
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 6, 't': 19, 'action': None, 'reward': 1.7326783536347823, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.73)
20% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (7, 2), heading: (0, -1), action: None, reward: 1.70433692531
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 5, 't': 20, 'action': None, 'reward': 1.7043369253078644, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.70)
16% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (7, 7), heading: (0, -1), action: forward, reward: 2.16863820529
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 4, 't': 21, 'action': 'forward', 'reward': 2.168638205285556, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.17)
12% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (7, 6), heading: (0, -1), action: forward, reward: 0.621686114416
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 3, 't': 22, 'action': 'forward', 'reward': 0.6216861144155188, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 0.62)
8% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 295
\-------------------------
Environment.reset(): Trial set up with start = (8, 6), destination = (4, 3), deadline = 35
Simulating trial. . .
epsilon = 0.0529; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 6), heading: (0, 1), action: None, reward: 1.38263846284
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 35, 't': 0, 'action': None, 'reward': 1.3826384628424004, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 1.38)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 6), heading: (0, 1), action: None, reward: 2.50931629094
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 34, 't': 1, 'action': None, 'reward': 2.509316290937938, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.51)
94% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 6), heading: (0, 1), action: None, reward: 1.40994084923
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 33, 't': 2, 'action': None, 'reward': 1.4099408492264618, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.41)
91% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 6), heading: (0, 1), action: None, reward: 1.99806348532
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 32, 't': 3, 'action': None, 'reward': 1.9980634853207917, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.00)
89% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: left, reward: 2.9290334057
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 31, 't': 4, 'action': 'left', 'reward': 2.9290334056986653, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.93)
86% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: forward, reward: 1.90671706701
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 30, 't': 5, 'action': 'forward', 'reward': 1.906717067005281, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.91)
83% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: forward, reward: -39.7312557983
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('forward', 'red', 'left', 'forward'), 'deadline': 29, 't': 6, 'action': 'forward', 'reward': -39.73125579828291, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'forward')
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -39.73)
80% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: None, reward: 1.52227175126
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 28, 't': 7, 'action': None, 'reward': 1.5222717512598118, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.52)
77% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: None, reward: 1.92177772364
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 27, 't': 8, 'action': None, 'reward': 1.9217777236379672, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.92)
74% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: forward, reward: 2.02248999233
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'left'), 'deadline': 26, 't': 9, 'action': 'forward', 'reward': 2.022489992333866, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'left')
Agent followed the waypoint forward. (rewarded 2.02)
71% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: None, reward: 1.08375709866
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 25, 't': 10, 'action': None, 'reward': 1.0837570986575984, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.08)
69% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (4, 6), heading: (1, 0), action: forward, reward: 2.19040098889
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 24, 't': 11, 'action': 'forward', 'reward': 2.1904009888902554, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.19)
66% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (4, 7), heading: (0, 1), action: right, reward: 2.2665653447
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'right', None), 'deadline': 23, 't': 12, 'action': 'right', 'reward': 2.266565344703369, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'right', None)
Agent followed the waypoint right. (rewarded 2.27)
63% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (4, 7), heading: (0, 1), action: None, reward: 2.50672986404
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', 'forward'), 'deadline': 22, 't': 13, 'action': None, 'reward': 2.5067298640392917, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', 'forward')
Agent properly idled at a red light. (rewarded 2.51)
60% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (4, 7), heading: (0, 1), action: None, reward: 1.92341510563
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 21, 't': 14, 'action': None, 'reward': 1.9234151056290594, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.92)
57% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (4, 7), heading: (0, 1), action: None, reward: 1.52604781816
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 20, 't': 15, 'action': None, 'reward': 1.5260478181608472, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.53)
54% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (3, 7), heading: (-1, 0), action: right, reward: 1.35114401996
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'left'), 'deadline': 19, 't': 16, 'action': 'right', 'reward': 1.3511440199641767, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'left')
Agent drove right instead of forward. (rewarded 1.35)
51% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (3, 7), heading: (-1, 0), action: None, reward: 1.30035782773
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 18, 't': 17, 'action': None, 'reward': 1.3003578277269157, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.30)
49% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (3, 7), heading: (-1, 0), action: None, reward: 1.40623524852
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 17, 't': 18, 'action': None, 'reward': 1.4062352485188927, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.41)
46% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (3, 2), heading: (0, 1), action: left, reward: 0.908434227819
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 16, 't': 19, 'action': 'left', 'reward': 0.9084342278193394, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 0.91)
43% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (4, 2), heading: (1, 0), action: left, reward: 2.28351269439
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 15, 't': 20, 'action': 'left', 'reward': 2.2835126943941377, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.28)
40% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (4, 2), heading: (1, 0), action: None, reward: 1.43157483206
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'forward'), 'deadline': 14, 't': 21, 'action': None, 'reward': 1.4315748320608563, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.43)
37% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 3), heading: (0, 1), action: right, reward: 1.96642712187
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 13, 't': 22, 'action': 'right', 'reward': 1.9664271218744662, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.97)
34% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 296
\-------------------------
Environment.reset(): Trial set up with start = (4, 2), destination = (1, 5), deadline = 30
Simulating trial. . .
epsilon = 0.0523; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (4, 2), heading: (0, -1), action: None, reward: 1.42320139853
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 30, 't': 0, 'action': None, 'reward': 1.4232013985309537, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.42)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (4, 2), heading: (0, -1), action: None, reward: 1.19242425265
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 29, 't': 1, 'action': None, 'reward': 1.192424252648433, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.19)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (4, 2), heading: (0, -1), action: None, reward: 1.44084453066
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 28, 't': 2, 'action': None, 'reward': 1.4408445306649107, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.44)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 2), heading: (1, 0), action: right, reward: 0.736415352145
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 27, 't': 3, 'action': 'right', 'reward': 0.7364153521448203, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 0.74)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 2), heading: (1, 0), action: None, reward: 1.58180849933
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 26, 't': 4, 'action': None, 'reward': 1.5818084993317665, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.58)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 2), heading: (1, 0), action: None, reward: 1.25669428328
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 25, 't': 5, 'action': None, 'reward': 1.2566942832790058, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.26)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (5, 2), heading: (1, 0), action: None, reward: 1.03534055785
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 24, 't': 6, 'action': None, 'reward': 1.035340557854097, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 1.04)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 2), heading: (1, 0), action: forward, reward: 2.87027428176
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 23, 't': 7, 'action': 'forward', 'reward': 2.8702742817642024, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.87)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (7, 2), heading: (1, 0), action: forward, reward: 2.64105927169
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 22, 't': 8, 'action': 'forward', 'reward': 2.6410592716926375, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.64)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (7, 3), heading: (0, 1), action: right, reward: 1.64685015054
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 21, 't': 9, 'action': 'right', 'reward': 1.6468501505385578, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent drove right instead of forward. (rewarded 1.65)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: left, reward: 0.905678916811
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 20, 't': 10, 'action': 'left', 'reward': 0.9056789168108101, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 0.91)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (8, 2), heading: (0, -1), action: left, reward: 0.0919076786737
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 19, 't': 11, 'action': 'left', 'reward': 0.09190767867366889, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove left instead of forward. (rewarded 0.09)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: right, reward: 2.52024213284
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 18, 't': 12, 'action': 'right', 'reward': 2.520242132839858, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.52)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: None, reward: 2.08361014454
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 17, 't': 13, 'action': None, 'reward': 2.083610144537033, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.08)
53% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: forward, reward: -10.5037415808
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 16, 't': 14, 'action': 'forward', 'reward': -10.503741580809884, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -10.50)
50% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (1, 7), heading: (0, -1), action: left, reward: 2.57977098037
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'right'), 'deadline': 15, 't': 15, 'action': 'left', 'reward': 2.5797709803661055, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'right')
Agent followed the waypoint left. (rewarded 2.58)
47% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (1, 6), heading: (0, -1), action: forward, reward: 1.22673625872
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 14, 't': 16, 'action': 'forward', 'reward': 1.2267362587196395, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.23)
43% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 5), heading: (0, -1), action: forward, reward: 1.32974657962
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 13, 't': 17, 'action': 'forward', 'reward': 1.3297465796171968, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.33)
40% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 297
\-------------------------
Environment.reset(): Trial set up with start = (7, 3), destination = (2, 5), deadline = 25
Simulating trial. . .
epsilon = 0.0518; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (7, 3), heading: (0, -1), action: None, reward: -4.49431164798
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 1, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 25, 't': 0, 'action': None, 'reward': -4.494311647977522, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent idled at a green light with no oncoming traffic. (rewarded -4.49)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 3), heading: (-1, 0), action: left, reward: 1.56751372631
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', 'forward'), 'deadline': 24, 't': 1, 'action': 'left', 'reward': 1.5675137263080297, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', 'forward')
Agent drove left instead of right. (rewarded 1.57)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 2), heading: (0, -1), action: right, reward: 0.431873649959
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 23, 't': 2, 'action': 'right', 'reward': 0.4318736499589584, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent drove right instead of left. (rewarded 0.43)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (6, 2), heading: (0, -1), action: None, reward: 0.304027660479
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'right', 'forward'), 'deadline': 22, 't': 3, 'action': None, 'reward': 0.30402766047924845, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', 'forward')
Agent properly idled at a red light. (rewarded 0.30)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 2), heading: (1, 0), action: right, reward: 1.17206150432
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'right', None), 'deadline': 21, 't': 4, 'action': 'right', 'reward': 1.1720615043182039, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', None)
Agent followed the waypoint right. (rewarded 1.17)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 2), heading: (1, 0), action: None, reward: 2.65605550863
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 20, 't': 5, 'action': None, 'reward': 2.656055508625086, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.66)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (7, 2), heading: (1, 0), action: None, reward: 0.995724928842
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 19, 't': 6, 'action': None, 'reward': 0.9957249288422929, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.00)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (8, 2), heading: (1, 0), action: forward, reward: 1.18514039061
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 1.185140390605243, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.19)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (8, 2), heading: (1, 0), action: None, reward: 2.7479657204
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'right'), 'deadline': 17, 't': 8, 'action': None, 'reward': 2.7479657204002974, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 2.75)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (8, 2), heading: (1, 0), action: None, reward: 2.17320838842
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 16, 't': 9, 'action': None, 'reward': 2.1732083884218896, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.17)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: forward, reward: 1.19466999966
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 15, 't': 10, 'action': 'forward', 'reward': 1.1946699996565906, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.19)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (2, 2), heading: (1, 0), action: forward, reward: 1.39236018846
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 14, 't': 11, 'action': 'forward', 'reward': 1.392360188456501, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent followed the waypoint forward. (rewarded 1.39)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (2, 3), heading: (0, 1), action: right, reward: 0.463918999823
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', 'forward'), 'deadline': 13, 't': 12, 'action': 'right', 'reward': 0.46391899982252194, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', 'forward')
Agent drove right instead of left. (rewarded 0.46)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (2, 4), heading: (0, 1), action: forward, reward: 1.62672708183
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 12, 't': 13, 'action': 'forward', 'reward': 1.626727081828636, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.63)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (2, 5), heading: (0, 1), action: forward, reward: 2.58462676369
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 11, 't': 14, 'action': 'forward', 'reward': 2.5846267636893066, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.58)
40% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 298
\-------------------------
Environment.reset(): Trial set up with start = (5, 4), destination = (1, 2), deadline = 30
Simulating trial. . .
epsilon = 0.0513; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 3), heading: (0, -1), action: left, reward: 0.980286225421
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 30, 't': 0, 'action': 'left', 'reward': 0.9802862254209261, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'right')
Agent drove left instead of forward. (rewarded 0.98)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 3), heading: (1, 0), action: right, reward: 2.89347048089
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', 'left'), 'deadline': 29, 't': 1, 'action': 'right', 'reward': 2.893470480888894, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', 'left')
Agent followed the waypoint right. (rewarded 2.89)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 3), heading: (1, 0), action: None, reward: 2.08792069822
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 28, 't': 2, 'action': None, 'reward': 2.0879206982224203, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 2.09)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 3), heading: (1, 0), action: forward, reward: 1.88169765596
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 27, 't': 3, 'action': 'forward', 'reward': 1.8816976559644583, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.88)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 3), heading: (1, 0), action: None, reward: 2.61915346396
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 26, 't': 4, 'action': None, 'reward': 2.619153463962918, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.62)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: forward, reward: 0.990060427837
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 25, 't': 5, 'action': 'forward', 'reward': 0.9900604278368903, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 0.99)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: forward, reward: 1.59841934912
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 24, 't': 6, 'action': 'forward', 'reward': 1.598419349117181, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.60)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: forward, reward: 0.462439957639
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', None), 'deadline': 23, 't': 7, 'action': 'forward', 'reward': 0.46243995763933976, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', None)
Agent drove forward instead of left. (rewarded 0.46)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: forward, reward: 1.0535877653
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', None), 'deadline': 22, 't': 8, 'action': 'forward', 'reward': 1.0535877652959047, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', None)
Agent drove forward instead of left. (rewarded 1.05)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: None, reward: 2.26884486376
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 21, 't': 9, 'action': None, 'reward': 2.2688448637579484, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 2.27)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: None, reward: 1.61254140182
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 20, 't': 10, 'action': None, 'reward': 1.6125414018156214, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.61)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (4, 3), heading: (1, 0), action: forward, reward: 0.349045722955
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', 'forward'), 'deadline': 19, 't': 11, 'action': 'forward', 'reward': 0.3490457229552545, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', 'forward')
Agent drove forward instead of left. (rewarded 0.35)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (4, 2), heading: (0, -1), action: left, reward: 2.56032907509
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 18, 't': 12, 'action': 'left', 'reward': 2.5603290750861154, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 2.56)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (3, 2), heading: (-1, 0), action: left, reward: 1.79812916764
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'right'), 'deadline': 17, 't': 13, 'action': 'left', 'reward': 1.7981291676355047, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'right')
Agent followed the waypoint left. (rewarded 1.80)
53% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: forward, reward: 1.43794363486
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 16, 't': 14, 'action': 'forward', 'reward': 1.4379436348600807, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.44)
50% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: None, reward: 2.15652628632
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 15, 't': 15, 'action': None, 'reward': 2.1565262863201884, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.16)
47% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: None, reward: 0.922789546623
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 14, 't': 16, 'action': None, 'reward': 0.922789546623249, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 0.92)
43% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 2), heading: (-1, 0), action: forward, reward: 0.900854184719
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 13, 't': 17, 'action': 'forward', 'reward': 0.9008541847194731, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 0.90)
40% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 299
\-------------------------
Environment.reset(): Trial set up with start = (2, 7), destination = (1, 4), deadline = 20
Simulating trial. . .
epsilon = 0.0508; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 7), heading: (-1, 0), action: right, reward: 2.8704668959
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 2.8704668958961914, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.87)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 7), heading: (-1, 0), action: None, reward: 2.43829393527
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', 'forward'), 'deadline': 19, 't': 1, 'action': None, 'reward': 2.4382939352738826, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 2.44)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 7), heading: (-1, 0), action: None, reward: 2.65827795419
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.6582779541852046, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.66)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (1, 2), heading: (0, 1), action: left, reward: 1.85791944907
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 17, 't': 3, 'action': 'left', 'reward': 1.857919449067414, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.86)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 3), heading: (0, 1), action: forward, reward: 2.90331039345
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 2.9033103934502487, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.90)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 4), heading: (0, 1), action: forward, reward: 2.62612929432
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 2.6261292943212196, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.63)
70% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 300
\-------------------------
Environment.reset(): Trial set up with start = (4, 4), destination = (2, 6), deadline = 20
Simulating trial. . .
epsilon = 0.0503; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (4, 5), heading: (0, 1), action: right, reward: 1.9708113158
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.9708113157955416, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent followed the waypoint right. (rewarded 1.97)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 5), heading: (-1, 0), action: right, reward: 1.7150458014
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 1.7150458013954937, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 1.72)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: forward, reward: 0.991964721523
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 18, 't': 2, 'action': 'forward', 'reward': 0.9919647215234155, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 0.99)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: None, reward: 1.2237077285
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 17, 't': 3, 'action': None, 'reward': 1.2237077284959792, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.22)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: left, reward: 2.47442885034
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 16, 't': 4, 'action': 'left', 'reward': 2.474428850342929, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.47)
75% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 301
\-------------------------
Environment.reset(): Trial set up with start = (7, 3), destination = (3, 4), deadline = 25
Simulating trial. . .
epsilon = 0.0498; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (7, 4), heading: (0, 1), action: left, reward: 1.45522411675
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 25, 't': 0, 'action': 'left', 'reward': 1.455224116748246, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 1.46)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 4), heading: (1, 0), action: left, reward: 2.86433994539
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 24, 't': 1, 'action': 'left', 'reward': 2.8643399453868366, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.86)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 4), heading: (1, 0), action: forward, reward: 2.30947225046
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 23, 't': 2, 'action': 'forward', 'reward': 2.309472250458928, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.31)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (1, 4), heading: (1, 0), action: None, reward: 1.28789513392
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 22, 't': 3, 'action': None, 'reward': 1.2878951339214235, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 1.29)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 4), heading: (1, 0), action: None, reward: 1.95196009468
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 21, 't': 4, 'action': None, 'reward': 1.951960094684487, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.95)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 4), heading: (1, 0), action: forward, reward: 2.00011708065
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 20, 't': 5, 'action': 'forward', 'reward': 2.0001170806527755, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.00)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 4), heading: (1, 0), action: None, reward: 2.02002801952
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 19, 't': 6, 'action': None, 'reward': 2.020028019521106, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.02)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: forward, reward: 2.14023274446
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 2.140232744460601, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.14)
68% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 302
\-------------------------
Environment.reset(): Trial set up with start = (8, 7), destination = (6, 3), deadline = 20
Simulating trial. . .
epsilon = 0.0493; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: right, reward: 1.28585810229
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.2858581022927007, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 1.29)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 2), heading: (0, 1), action: right, reward: 2.45667692163
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 2.456676921627189, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent followed the waypoint right. (rewarded 2.46)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: right, reward: 2.55194585985
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 18, 't': 2, 'action': 'right', 'reward': 2.551945859847791, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 2.55)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 2), heading: (-1, 0), action: forward, reward: 2.84109678432
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 2.8410967843164205, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.84)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (6, 2), heading: (-1, 0), action: forward, reward: 1.37358556226
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 1.3735855622553588, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.37)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (6, 2), heading: (-1, 0), action: None, reward: 2.84511860827
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', 'left'), 'deadline': 15, 't': 5, 'action': None, 'reward': 2.845118608267286, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'left')
Agent properly idled at a red light. (rewarded 2.85)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (6, 2), heading: (-1, 0), action: left, reward: -9.44219672018
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 14, 't': 6, 'action': 'left', 'reward': -9.442196720181432, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent attempted driving left through a red light. (rewarded -9.44)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 7), heading: (0, -1), action: right, reward: 0.967148679312
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 13, 't': 7, 'action': 'right', 'reward': 0.9671486793116426, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 0.97)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (7, 7), heading: (1, 0), action: right, reward: 2.82284773114
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 12, 't': 8, 'action': 'right', 'reward': 2.8228477311393094, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 2.82)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (7, 2), heading: (0, 1), action: right, reward: 0.890907463548
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 11, 't': 9, 'action': 'right', 'reward': 0.8909074635477912, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 0.89)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (6, 2), heading: (-1, 0), action: right, reward: 1.35095288396
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 10, 't': 10, 'action': 'right', 'reward': 1.3509528839649496, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent followed the waypoint right. (rewarded 1.35)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 3), heading: (0, 1), action: left, reward: 2.5594166452
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'right'), 'deadline': 9, 't': 11, 'action': 'left', 'reward': 2.5594166451977856, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'right')
Agent followed the waypoint left. (rewarded 2.56)
40% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 303
\-------------------------
Environment.reset(): Trial set up with start = (8, 2), destination = (3, 3), deadline = 20
Simulating trial. . .
epsilon = 0.0488; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: None, reward: 1.18683326003
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 20, 't': 0, 'action': None, 'reward': 1.1868332600346727, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.19)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: None, reward: 2.96738284586
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 19, 't': 1, 'action': None, 'reward': 2.967382845864276, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.97)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: None, reward: 2.72461934398
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'left'), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.7246193439808537, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 2.72)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: None, reward: 1.2835744698
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 1.2835744698033902, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.28)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 3), heading: (0, 1), action: left, reward: 2.91860157875
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 16, 't': 4, 'action': 'left', 'reward': 2.918601578753772, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.92)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: left, reward: 2.74402156884
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 15, 't': 5, 'action': 'left', 'reward': 2.744021568840007, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.74)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: None, reward: 1.45947889699
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 14, 't': 6, 'action': None, 'reward': 1.4594788969874437, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.46)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: forward, reward: 1.62124837647
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 1.6212483764659111, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.62)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 2.01056499909
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 12, 't': 8, 'action': None, 'reward': 2.0105649990905485, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.01)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 2.76990447528
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 11, 't': 9, 'action': None, 'reward': 2.769904475283395, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.77)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 0.922623581039
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'forward'), 'deadline': 10, 't': 10, 'action': None, 'reward': 0.922623581039365, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 0.92)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: forward, reward: 2.5707272955
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 9, 't': 11, 'action': 'forward', 'reward': 2.570727295499871, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.57)
40% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 304
\-------------------------
Environment.reset(): Trial set up with start = (8, 4), destination = (5, 2), deadline = 25
Simulating trial. . .
epsilon = 0.0483; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (7, 4), heading: (-1, 0), action: right, reward: 1.96063908848
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', 'left'), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 1.9606390884789924, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', 'left')
Agent followed the waypoint right. (rewarded 1.96)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (7, 4), heading: (-1, 0), action: None, reward: 2.39958696291
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'forward'), 'deadline': 24, 't': 1, 'action': None, 'reward': 2.399586962913633, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 2.40)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 4), heading: (-1, 0), action: None, reward: 2.90989958881
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 23, 't': 2, 'action': None, 'reward': 2.9098995888097554, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.91)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 4), heading: (-1, 0), action: None, reward: 2.76268213059
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 22, 't': 3, 'action': None, 'reward': 2.762682130590756, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.76)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (6, 4), heading: (-1, 0), action: forward, reward: 1.76349685072
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 1.763496850724592, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.76)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (6, 4), heading: (-1, 0), action: None, reward: 2.17725989766
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 20, 't': 5, 'action': None, 'reward': 2.1772598976565782, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.18)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (6, 4), heading: (-1, 0), action: None, reward: 1.29673076286
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 19, 't': 6, 'action': None, 'reward': 1.29673076285698, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.30)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (5, 4), heading: (-1, 0), action: forward, reward: 1.84674992508
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 1.8467499250841988, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.85)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (5, 3), heading: (0, -1), action: right, reward: 2.10989158922
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 17, 't': 8, 'action': 'right', 'reward': 2.109891589219612, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.11)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 2), heading: (0, -1), action: forward, reward: 1.30751132027
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 16, 't': 9, 'action': 'forward', 'reward': 1.3075113202749675, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.31)
60% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 305
\-------------------------
Environment.reset(): Trial set up with start = (4, 4), destination = (2, 7), deadline = 25
Simulating trial. . .
epsilon = 0.0478; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 4), heading: (-1, 0), action: right, reward: 2.15582035716
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', 'right'), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 2.155820357155398, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', 'right')
Agent followed the waypoint right. (rewarded 2.16)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 4), heading: (-1, 0), action: None, reward: 1.62079923824
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 24, 't': 1, 'action': None, 'reward': 1.6207992382358742, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.62)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 4), heading: (-1, 0), action: None, reward: 1.1453437353
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.1453437352989462, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.15)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 4), heading: (-1, 0), action: None, reward: 2.55061539816
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 22, 't': 3, 'action': None, 'reward': 2.5506153981551436, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.55)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 4), heading: (-1, 0), action: None, reward: 1.50314258684
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 21, 't': 4, 'action': None, 'reward': 1.5031425868354604, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.50)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (3, 4), heading: (-1, 0), action: left, reward: -19.3005177159
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 3, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 20, 't': 5, 'action': 'left', 'reward': -19.300517715898092, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent attempted driving left through traffic and cause a minor accident. (rewarded -19.30)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 4), heading: (-1, 0), action: forward, reward: 1.02203805979
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 19, 't': 6, 'action': 'forward', 'reward': 1.0220380597925507, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.02)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 3), heading: (0, -1), action: right, reward: 1.7742701387
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 18, 't': 7, 'action': 'right', 'reward': 1.7742701386966482, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.77)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 3), heading: (0, -1), action: None, reward: 2.67930486713
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'right'), 'deadline': 17, 't': 8, 'action': None, 'reward': 2.679304867131595, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'right')
Agent properly idled at a red light. (rewarded 2.68)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 3), heading: (0, -1), action: None, reward: 1.82904300711
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 16, 't': 9, 'action': None, 'reward': 1.829043007107403, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 1.83)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (2, 3), heading: (0, -1), action: None, reward: 1.10407344914
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 15, 't': 10, 'action': None, 'reward': 1.1040734491403972, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.10)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (2, 2), heading: (0, -1), action: forward, reward: 1.64914340219
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 14, 't': 11, 'action': 'forward', 'reward': 1.6491434021935858, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.65)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (2, 2), heading: (0, -1), action: None, reward: 1.50381734575
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 13, 't': 12, 'action': None, 'reward': 1.5038173457456865, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.50)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (2, 7), heading: (0, -1), action: forward, reward: 1.13655450966
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 12, 't': 13, 'action': 'forward', 'reward': 1.1365545096574445, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.14)
44% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 306
\-------------------------
Environment.reset(): Trial set up with start = (4, 5), destination = (8, 5), deadline = 20
Simulating trial. . .
epsilon = 0.0474; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (4, 6), heading: (0, 1), action: left, reward: 1.91125062148
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 20, 't': 0, 'action': 'left', 'reward': 1.9112506214810367, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove left instead of forward. (rewarded 1.91)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 6), heading: (-1, 0), action: right, reward: 1.72039891911
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 1.720398919113664, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.72)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 5), heading: (0, -1), action: right, reward: 0.931120313707
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', 'left'), 'deadline': 18, 't': 2, 'action': 'right', 'reward': 0.931120313706637, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', 'left')
Agent drove right instead of forward. (rewarded 0.93)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 5), heading: (0, -1), action: None, reward: 1.36830964129
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 1.3683096412857778, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.37)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: right, reward: 1.72745466765
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 16, 't': 4, 'action': 'right', 'reward': 1.7274546676512763, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 1.73)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (4, 6), heading: (0, 1), action: right, reward: 1.84458484047
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 15, 't': 5, 'action': 'right', 'reward': 1.844584840466373, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 1.84)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 6), heading: (-1, 0), action: right, reward: 1.02510490834
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 14, 't': 6, 'action': 'right', 'reward': 1.0251049083399169, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.03)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 6), heading: (-1, 0), action: None, reward: 1.56800435533
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 13, 't': 7, 'action': None, 'reward': 1.568004355328128, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.57)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 6), heading: (-1, 0), action: None, reward: 2.56183545152
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 12, 't': 8, 'action': None, 'reward': 2.561835451520145, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.56)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: forward, reward: 2.01883643386
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 11, 't': 9, 'action': 'forward', 'reward': 2.018836433855447, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.02)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: forward, reward: 1.35158242705
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 10, 't': 10, 'action': 'forward', 'reward': 1.3515824270453762, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.35)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (8, 6), heading: (-1, 0), action: forward, reward: 1.42366001247
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 9, 't': 11, 'action': 'forward', 'reward': 1.4236600124719618, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent followed the waypoint forward. (rewarded 1.42)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (8, 5), heading: (0, -1), action: right, reward: 0.732686586832
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 8, 't': 12, 'action': 'right', 'reward': 0.7326865868324146, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 0.73)
35% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 307
\-------------------------
Environment.reset(): Trial set up with start = (2, 2), destination = (6, 2), deadline = 20
Simulating trial. . .
epsilon = 0.0469; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 3), heading: (0, 1), action: right, reward: 1.69533229325
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.6953322932489476, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 1.70)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 3), heading: (-1, 0), action: right, reward: 2.41077664677
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 2.4107766467716454, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 2.41)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 3), heading: (-1, 0), action: forward, reward: 2.15247316351
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 18, 't': 2, 'action': 'forward', 'reward': 2.152473163513765, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.15)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 3), heading: (-1, 0), action: forward, reward: 1.3566246792
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 1.356624679195661, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.36)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (6, 3), heading: (-1, 0), action: forward, reward: 2.01705861714
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 2.017058617137935, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.02)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 2), heading: (0, -1), action: right, reward: 1.19609133684
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', 'right'), 'deadline': 15, 't': 5, 'action': 'right', 'reward': 1.1960913368441788, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', 'right')
Agent followed the waypoint right. (rewarded 1.20)
70% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 308
\-------------------------
Environment.reset(): Trial set up with start = (5, 4), destination = (2, 5), deadline = 20
Simulating trial. . .
epsilon = 0.0464; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 4), heading: (1, 0), action: right, reward: 0.755513519407
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'forward'), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 0.755513519406767, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'forward')
Agent drove right instead of left. (rewarded 0.76)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 4), heading: (1, 0), action: None, reward: 1.38008050099
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 19, 't': 1, 'action': None, 'reward': 1.3800805009867427, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.38)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 4), heading: (1, 0), action: None, reward: 2.16898142514
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.1689814251364057, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.17)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 4), heading: (1, 0), action: forward, reward: 2.5944135877
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 2.5944135877011885, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.59)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 4), heading: (1, 0), action: forward, reward: 2.65706831978
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 2.657068319782695, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.66)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 4), heading: (1, 0), action: None, reward: 1.61079085156
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'left'), 'deadline': 15, 't': 5, 'action': None, 'reward': 1.6107908515646538, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'left')
Agent properly idled at a red light. (rewarded 1.61)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (8, 4), heading: (1, 0), action: None, reward: 2.26500451693
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'left'), 'deadline': 14, 't': 6, 'action': None, 'reward': 2.2650045169334017, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'left')
Agent properly idled at a red light. (rewarded 2.27)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 4), heading: (1, 0), action: forward, reward: 2.85779025228
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'forward'), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 2.8577902522773946, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'forward')
Agent followed the waypoint forward. (rewarded 2.86)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 4), heading: (1, 0), action: forward, reward: 1.9721027175
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 12, 't': 8, 'action': 'forward', 'reward': 1.9721027175041057, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.97)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (2, 5), heading: (0, 1), action: right, reward: 2.51941280174
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 11, 't': 9, 'action': 'right', 'reward': 2.5194128017398736, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent followed the waypoint right. (rewarded 2.52)
50% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 309
\-------------------------
Environment.reset(): Trial set up with start = (1, 7), destination = (5, 5), deadline = 30
Simulating trial. . .
epsilon = 0.0460; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 6), heading: (0, -1), action: left, reward: 1.9843109899
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 30, 't': 0, 'action': 'left', 'reward': 1.9843109898966114, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.98)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: right, reward: 0.928984019533
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'left'), 'deadline': 29, 't': 1, 'action': 'right', 'reward': 0.9289840195326496, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'left')
Agent drove right instead of left. (rewarded 0.93)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: forward, reward: 1.59631246348
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 28, 't': 2, 'action': 'forward', 'reward': 1.59631246348159, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.60)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: None, reward: 2.35915800878
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 27, 't': 3, 'action': None, 'reward': 2.3591580087844175, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.36)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: None, reward: 1.67527770391
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 26, 't': 4, 'action': None, 'reward': 1.6752777039075823, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.68)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (4, 6), heading: (1, 0), action: forward, reward: 2.16381687367
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 25, 't': 5, 'action': 'forward', 'reward': 2.163816873671389, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.16)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: forward, reward: 1.97284961985
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 24, 't': 6, 'action': 'forward', 'reward': 1.972849619852387, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'right')
Agent followed the waypoint forward. (rewarded 1.97)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 5), heading: (0, -1), action: left, reward: 0.963128415984
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 23, 't': 7, 'action': 'left', 'reward': 0.9631284159843079, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 0.96)
73% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 310
\-------------------------
Environment.reset(): Trial set up with start = (2, 2), destination = (6, 4), deadline = 30
Simulating trial. . .
epsilon = 0.0455; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 2), heading: (-1, 0), action: forward, reward: 1.98767433582
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 30, 't': 0, 'action': 'forward', 'reward': 1.9876743358199407, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.99)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: forward, reward: 2.84778107337
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 29, 't': 1, 'action': 'forward', 'reward': 2.8477810733689717, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.85)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: None, reward: 1.23325377797
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 28, 't': 2, 'action': None, 'reward': 1.2332537779730666, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.23)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: None, reward: 1.94935489334
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 27, 't': 3, 'action': None, 'reward': 1.949354893339979, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.95)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 2), heading: (-1, 0), action: forward, reward: 2.83369653219
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 26, 't': 4, 'action': 'forward', 'reward': 2.833696532187121, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.83)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (6, 2), heading: (-1, 0), action: forward, reward: 2.90648694148
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 25, 't': 5, 'action': 'forward', 'reward': 2.906486941481016, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.91)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (6, 3), heading: (0, 1), action: left, reward: 1.84130432336
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 24, 't': 6, 'action': 'left', 'reward': 1.8413043233573498, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 1.84)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 3), heading: (0, 1), action: None, reward: 1.7890869535
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 23, 't': 7, 'action': None, 'reward': 1.789086953501541, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.79)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (6, 3), heading: (0, 1), action: None, reward: 2.13656724461
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'right'), 'deadline': 22, 't': 8, 'action': None, 'reward': 2.1365672446096013, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 2.14)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 4), heading: (0, 1), action: forward, reward: 1.78762832702
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 21, 't': 9, 'action': 'forward', 'reward': 1.787628327017352, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent followed the waypoint forward. (rewarded 1.79)
67% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 311
\-------------------------
Environment.reset(): Trial set up with start = (7, 7), destination = (3, 6), deadline = 25
Simulating trial. . .
epsilon = 0.0450; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (7, 7), heading: (0, 1), action: None, reward: 1.61800410392
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 25, 't': 0, 'action': None, 'reward': 1.6180041039223985, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 1.62)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (7, 7), heading: (0, 1), action: None, reward: 1.34226992804
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 24, 't': 1, 'action': None, 'reward': 1.3422699280439285, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.34)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 7), heading: (0, 1), action: None, reward: 1.6843337142
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', 'left'), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.684333714200434, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'left')
Agent properly idled at a red light. (rewarded 1.68)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 7), heading: (0, 1), action: None, reward: 1.51595599754
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', 'left'), 'deadline': 22, 't': 3, 'action': None, 'reward': 1.5159559975422963, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'left')
Agent properly idled at a red light. (rewarded 1.52)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: left, reward: 1.846803427
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 21, 't': 4, 'action': 'left', 'reward': 1.8468034270025033, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.85)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: forward, reward: 2.21433179852
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 20, 't': 5, 'action': 'forward', 'reward': 2.214331798515374, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.21)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: None, reward: 1.78482739204
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 19, 't': 6, 'action': None, 'reward': 1.7848273920367959, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.78)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 7), heading: (1, 0), action: forward, reward: 1.74354468483
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'forward'), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 1.7435446848341396, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'forward')
Agent followed the waypoint forward. (rewarded 1.74)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 7), heading: (1, 0), action: forward, reward: 1.09106466389
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 17, 't': 8, 'action': 'forward', 'reward': 1.0910646638947283, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.09)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (3, 7), heading: (1, 0), action: None, reward: 1.8650886672
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 16, 't': 9, 'action': None, 'reward': 1.8650886671991058, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.87)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (3, 7), heading: (1, 0), action: None, reward: 1.36961081715
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 15, 't': 10, 'action': None, 'reward': 1.3696108171500678, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.37)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 6), heading: (0, -1), action: left, reward: 2.01781696166
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 14, 't': 11, 'action': 'left', 'reward': 2.0178169616639687, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 2.02)
52% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 312
\-------------------------
Environment.reset(): Trial set up with start = (2, 2), destination = (5, 6), deadline = 25
Simulating trial. . .
epsilon = 0.0446; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 2), heading: (-1, 0), action: right, reward: 1.90478895409
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'left'), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 1.9047889540904366, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'left')
Agent drove right instead of left. (rewarded 1.90)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 2), heading: (-1, 0), action: None, reward: 1.31602182029
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 24, 't': 1, 'action': None, 'reward': 1.3160218202898013, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.32)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 2), heading: (-1, 0), action: None, reward: 1.3028469556
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.302846955598172, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.30)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: forward, reward: 2.16732117178
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': 2.1673211717833425, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.17)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 2), heading: (-1, 0), action: forward, reward: 1.37790682012
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 1.377906820123507, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.38)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 2), heading: (-1, 0), action: None, reward: 2.51547768831
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 20, 't': 5, 'action': None, 'reward': 2.515477688306551, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.52)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (7, 2), heading: (-1, 0), action: None, reward: 1.00267546628
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 19, 't': 6, 'action': None, 'reward': 1.0026754662844513, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.00)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 2), heading: (-1, 0), action: forward, reward: 2.26957405684
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 2.2695740568418885, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.27)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (5, 2), heading: (-1, 0), action: forward, reward: 2.68429119038
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 17, 't': 8, 'action': 'forward', 'reward': 2.684291190377089, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.68)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (5, 7), heading: (0, -1), action: right, reward: 2.4307025957
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'right', 'left'), 'deadline': 16, 't': 9, 'action': 'right', 'reward': 2.430702595702729, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', 'left')
Agent followed the waypoint right. (rewarded 2.43)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 6), heading: (0, -1), action: forward, reward: 1.71114416043
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 15, 't': 10, 'action': 'forward', 'reward': 1.7111441604291173, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.71)
56% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 313
\-------------------------
Environment.reset(): Trial set up with start = (4, 2), destination = (6, 5), deadline = 25
Simulating trial. . .
epsilon = 0.0442; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 2), heading: (1, 0), action: right, reward: 1.25237125311
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 1.2523712531054922, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.25)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 2), heading: (1, 0), action: forward, reward: 1.85836187825
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 24, 't': 1, 'action': 'forward', 'reward': 1.858361878254864, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.86)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 7), heading: (0, -1), action: left, reward: 2.75388083607
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 23, 't': 2, 'action': 'left', 'reward': 2.753880836068392, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 2.75)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (6, 6), heading: (0, -1), action: forward, reward: 0.997797032544
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': 0.9977970325435601, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.00)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 5), heading: (0, -1), action: forward, reward: 2.09468637352
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 2.0946863735159784, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.09)
80% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 314
\-------------------------
Environment.reset(): Trial set up with start = (6, 2), destination = (8, 6), deadline = 20
Simulating trial. . .
epsilon = 0.0437; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 2), heading: (0, -1), action: None, reward: 0.467996676642
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', 'forward'), 'deadline': 20, 't': 0, 'action': None, 'reward': 0.4679966766418566, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 0.47)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 2), heading: (0, -1), action: None, reward: 0.967112199947
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', 'forward'), 'deadline': 19, 't': 1, 'action': None, 'reward': 0.9671121999471054, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 0.97)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 2), heading: (0, -1), action: left, reward: -9.09705301365
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': 'left', 'reward': -9.097053013646555, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent attempted driving left through a red light. (rewarded -9.10)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 2), heading: (1, 0), action: right, reward: 1.34799738916
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 17, 't': 3, 'action': 'right', 'reward': 1.3479973891611554, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent followed the waypoint right. (rewarded 1.35)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 2), heading: (1, 0), action: None, reward: 2.00848875835
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 16, 't': 4, 'action': None, 'reward': 2.0084887583473696, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.01)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 2), heading: (1, 0), action: forward, reward: 2.62370987015
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 2.623709870152596, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.62)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (8, 7), heading: (0, -1), action: left, reward: 1.81357400198
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 14, 't': 6, 'action': 'left', 'reward': 1.813574001978042, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 1.81)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (8, 7), heading: (0, -1), action: None, reward: 2.19978495952
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 13, 't': 7, 'action': None, 'reward': 2.1997849595204935, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.20)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (8, 7), heading: (0, -1), action: None, reward: 0.902360864036
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 12, 't': 8, 'action': None, 'reward': 0.9023608640359033, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.90)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (8, 7), heading: (0, -1), action: None, reward: 2.65648930624
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 11, 't': 9, 'action': None, 'reward': 2.65648930624109, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.66)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (8, 6), heading: (0, -1), action: forward, reward: 0.888046432459
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 10, 't': 10, 'action': 'forward', 'reward': 0.8880464324594053, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 0.89)
45% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 315
\-------------------------
Environment.reset(): Trial set up with start = (2, 5), destination = (6, 5), deadline = 20
Simulating trial. . .
epsilon = 0.0433; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: right, reward: 1.79320192712
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', 'right'), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.7932019271201813, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', 'right')
Agent followed the waypoint right. (rewarded 1.79)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: right, reward: 1.45040322152
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', 'left'), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 1.4504032215165648, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', 'left')
Agent followed the waypoint right. (rewarded 1.45)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 6), heading: (-1, 0), action: forward, reward: 1.29323262864
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 18, 't': 2, 'action': 'forward', 'reward': 1.2932326286373774, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.29)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 6), heading: (-1, 0), action: None, reward: 1.44361457213
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 1.443614572129385, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.44)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 6), heading: (-1, 0), action: forward, reward: 1.49008082005
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 1.4900808200531184, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.49)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 6), heading: (-1, 0), action: None, reward: 2.02292023716
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'forward'), 'deadline': 15, 't': 5, 'action': None, 'reward': 2.0229202371586053, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 2.02)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (6, 6), heading: (-1, 0), action: forward, reward: 0.900444228071
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 14, 't': 6, 'action': 'forward', 'reward': 0.9004442280706966, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 0.90)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 5), heading: (0, -1), action: right, reward: 1.65240878714
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 13, 't': 7, 'action': 'right', 'reward': 1.6524087871429778, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent followed the waypoint right. (rewarded 1.65)
60% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 316
\-------------------------
Environment.reset(): Trial set up with start = (3, 5), destination = (1, 2), deadline = 25
Simulating trial. . .
epsilon = 0.0429; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 4), heading: (0, -1), action: forward, reward: 1.51912255933
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', None), 'deadline': 25, 't': 0, 'action': 'forward', 'reward': 1.519122559326286, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', None)
Agent drove forward instead of left. (rewarded 1.52)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 4), heading: (-1, 0), action: left, reward: 1.1026500021
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 24, 't': 1, 'action': 'left', 'reward': 1.1026500021040917, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 1.10)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 3), heading: (0, -1), action: right, reward: 1.56229291507
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', 'forward'), 'deadline': 23, 't': 2, 'action': 'right', 'reward': 1.562292915073319, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', 'forward')
Agent drove right instead of forward. (rewarded 1.56)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 3), heading: (0, -1), action: None, reward: 2.86423516995
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 22, 't': 3, 'action': None, 'reward': 2.8642351699493833, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.86)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 3), heading: (0, -1), action: None, reward: 2.13311228422
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', 'forward'), 'deadline': 21, 't': 4, 'action': None, 'reward': 2.133112284216769, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 2.13)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 3), heading: (0, -1), action: None, reward: 2.30229300745
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 20, 't': 5, 'action': None, 'reward': 2.302293007447609, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.30)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: right, reward: 1.09892204139
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', 'forward'), 'deadline': 19, 't': 6, 'action': 'right', 'reward': 1.0989220413893124, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', 'forward')
Agent drove right instead of left. (rewarded 1.10)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 2), heading: (0, -1), action: left, reward: 1.99281432647
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 18, 't': 7, 'action': 'left', 'reward': 1.992814326465678, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.99)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: left, reward: 2.76688100588
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'right'), 'deadline': 17, 't': 8, 'action': 'left', 'reward': 2.7668810058829827, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'right')
Agent followed the waypoint left. (rewarded 2.77)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 2), heading: (-1, 0), action: forward, reward: 1.07358774632
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'forward'), 'deadline': 16, 't': 9, 'action': 'forward', 'reward': 1.0735877463244201, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'forward')
Agent followed the waypoint forward. (rewarded 1.07)
60% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 317
\-------------------------
Environment.reset(): Trial set up with start = (8, 4), destination = (2, 7), deadline = 25
Simulating trial. . .
epsilon = 0.0424; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 3), heading: (0, -1), action: right, reward: 1.50552817976
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', 'right'), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 1.5055281797572875, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', 'right')
Agent followed the waypoint right. (rewarded 1.51)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: right, reward: 1.35316172181
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', 'left'), 'deadline': 24, 't': 1, 'action': 'right', 'reward': 1.353161721805508, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', 'left')
Agent followed the waypoint right. (rewarded 1.35)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: None, reward: 1.20924306803
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.2092430680250508, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.21)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: forward, reward: 2.23131201801
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': 2.231312018009741, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.23)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 1.29540162692
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 21, 't': 4, 'action': None, 'reward': 1.2954016269217934, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.30)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 2.81084048549
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 20, 't': 5, 'action': None, 'reward': 2.8108404854923563, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.81)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 2), heading: (0, -1), action: left, reward: 2.12244848413
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 19, 't': 6, 'action': 'left', 'reward': 2.1224484841313105, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.12)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (2, 7), heading: (0, -1), action: forward, reward: 2.00684659196
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 2.0068465919612164, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.01)
68% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 318
\-------------------------
Environment.reset(): Trial set up with start = (8, 5), destination = (4, 5), deadline = 20
Simulating trial. . .
epsilon = 0.0420; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 5), heading: (1, 0), action: forward, reward: 2.83237153596
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 20, 't': 0, 'action': 'forward', 'reward': 2.8323715359593566, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.83)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 5), heading: (1, 0), action: None, reward: 2.59695225723
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 19, 't': 1, 'action': None, 'reward': 2.596952257231867, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.60)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 5), heading: (1, 0), action: None, reward: 2.38152584293
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.3815258429341526, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.38)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: forward, reward: 2.25144635769
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 2.251446357686464, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.25)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: forward, reward: 1.9326747691
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 1.9326747690983603, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.93)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: None, reward: 1.25636160443
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 15, 't': 5, 'action': None, 'reward': 1.2563616044326966, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.26)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: None, reward: 2.42830794443
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 14, 't': 6, 'action': None, 'reward': 2.4283079444344406, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.43)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: forward, reward: 1.81109897233
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 1.8110989723285762, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.81)
60% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 319
\-------------------------
Environment.reset(): Trial set up with start = (4, 5), destination = (7, 7), deadline = 25
Simulating trial. . .
epsilon = 0.0416; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 5), heading: (1, 0), action: left, reward: 2.55553695036
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'right'), 'deadline': 25, 't': 0, 'action': 'left', 'reward': 2.5555369503584124, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'right')
Agent followed the waypoint left. (rewarded 2.56)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 5), heading: (1, 0), action: forward, reward: 1.93818733273
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 24, 't': 1, 'action': 'forward', 'reward': 1.9381873327288826, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.94)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 5), heading: (1, 0), action: None, reward: 1.60545722208
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.6054572220843406, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.61)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 5), heading: (1, 0), action: forward, reward: 1.3564056684
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': 1.35640566839655, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.36)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 6), heading: (0, 1), action: right, reward: 1.7744485052
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 21, 't': 4, 'action': 'right', 'reward': 1.7744485051957124, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.77)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (6, 6), heading: (-1, 0), action: right, reward: 0.149477354303
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', 'forward'), 'deadline': 20, 't': 5, 'action': 'right', 'reward': 0.1494773543033575, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', 'forward')
Agent drove right instead of forward. (rewarded 0.15)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (6, 7), heading: (0, 1), action: left, reward: 2.83052097564
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 19, 't': 6, 'action': 'left', 'reward': 2.8305209756436382, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.83)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (7, 7), heading: (1, 0), action: left, reward: 2.35365902802
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 18, 't': 7, 'action': 'left', 'reward': 2.353659028019547, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.35)
68% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 320
\-------------------------
Environment.reset(): Trial set up with start = (2, 4), destination = (7, 6), deadline = 25
Simulating trial. . .
epsilon = 0.0412; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 5), heading: (0, 1), action: right, reward: 2.96947463227
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', 'left'), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 2.969474632272205, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', 'left')
Agent followed the waypoint right. (rewarded 2.97)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: right, reward: 1.1848799811
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 24, 't': 1, 'action': 'right', 'reward': 1.1848799810956736, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 1.18)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: None, reward: 1.3789629745
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.3789629745013163, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.38)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: forward, reward: 1.20183617946
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': 1.2018361794626364, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.20)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 5), heading: (-1, 0), action: forward, reward: 0.982568237189
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 0.9825682371894322, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 0.98)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (7, 6), heading: (0, 1), action: left, reward: 1.56273700498
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 20, 't': 5, 'action': 'left', 'reward': 1.5627370049759592, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 1.56)
76% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 321
\-------------------------
Environment.reset(): Trial set up with start = (1, 3), destination = (5, 3), deadline = 20
Simulating trial. . .
epsilon = 0.0408; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 3), heading: (0, -1), action: None, reward: 1.22521779376
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 20, 't': 0, 'action': None, 'reward': 1.2252177937565742, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.23)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 3), heading: (0, -1), action: None, reward: 2.2984967774
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 19, 't': 1, 'action': None, 'reward': 2.2984967774006217, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.30)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 3), heading: (0, -1), action: None, reward: 2.39873418763
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.3987341876337975, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.40)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 3), heading: (-1, 0), action: left, reward: 1.51244187724
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 17, 't': 3, 'action': 'left', 'reward': 1.5124418772370918, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.51)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 3), heading: (-1, 0), action: forward, reward: 1.99581173234
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 1.995811732342306, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 2.00)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 3), heading: (-1, 0), action: None, reward: 1.15587398637
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 15, 't': 5, 'action': None, 'reward': 1.1558739863701684, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.16)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (7, 3), heading: (-1, 0), action: None, reward: 1.88725364881
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 14, 't': 6, 'action': None, 'reward': 1.8872536488087752, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.89)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 3), heading: (-1, 0), action: forward, reward: 1.5379471986
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 1.5379471986000148, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.54)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (6, 3), heading: (-1, 0), action: None, reward: 1.75048060042
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', 'forward'), 'deadline': 12, 't': 8, 'action': None, 'reward': 1.7504806004157527, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', 'forward')
Agent properly idled at a red light. (rewarded 1.75)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 3), heading: (-1, 0), action: forward, reward: 2.79217757333
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 11, 't': 9, 'action': 'forward', 'reward': 2.792177573332665, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.79)
50% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 322
\-------------------------
Environment.reset(): Trial set up with start = (1, 7), destination = (5, 7), deadline = 20
Simulating trial. . .
epsilon = 0.0404; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: forward, reward: 1.95969965756
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 20, 't': 0, 'action': 'forward', 'reward': 1.9596996575564418, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.96)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: forward, reward: 1.07072458181
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 19, 't': 1, 'action': 'forward', 'reward': 1.0707245818052669, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.07)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: None, reward: 1.32739494338
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.3273949433796703, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.33)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: None, reward: 2.1931008903
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 17, 't': 3, 'action': None, 'reward': 2.193100890302315, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.19)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: forward, reward: 2.09544735722
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 2.095447357223902, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.10)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: None, reward: 1.49207825102
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 15, 't': 5, 'action': None, 'reward': 1.492078251015854, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.49)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: None, reward: 2.46171212402
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 14, 't': 6, 'action': None, 'reward': 2.4617121240167767, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.46)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 7), heading: (-1, 0), action: forward, reward: 1.38933405491
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 1.389334054910735, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.39)
60% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 323
\-------------------------
Environment.reset(): Trial set up with start = (8, 6), destination = (2, 3), deadline = 25
Simulating trial. . .
epsilon = 0.0400; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 6), heading: (0, 1), action: None, reward: 2.92931423397
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'right'), 'deadline': 25, 't': 0, 'action': None, 'reward': 2.929314233973745, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 2.93)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 6), heading: (0, 1), action: None, reward: 2.77069377279
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 24, 't': 1, 'action': None, 'reward': 2.770693772787247, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.77)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 6), heading: (0, 1), action: None, reward: 2.12216575325
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 23, 't': 2, 'action': None, 'reward': 2.122165753253113, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.12)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 6), heading: (0, 1), action: None, reward: 2.94563384677
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 22, 't': 3, 'action': None, 'reward': 2.9456338467674454, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.95)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: left, reward: 2.46642378841
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 21, 't': 4, 'action': 'left', 'reward': 2.4664237884139, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 2.47)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: forward, reward: 1.03489454837
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 20, 't': 5, 'action': 'forward', 'reward': 1.0348945483711152, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.03)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: None, reward: 0.889981240205
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'forward'), 'deadline': 19, 't': 6, 'action': None, 'reward': 0.8899812402049933, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 0.89)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 7), heading: (0, 1), action: right, reward: 1.51490535721
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 18, 't': 7, 'action': 'right', 'reward': 1.5149053572113504, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.51)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 2), heading: (0, 1), action: forward, reward: 2.50297480866
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 17, 't': 8, 'action': 'forward', 'reward': 2.5029748086576133, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent followed the waypoint forward. (rewarded 2.50)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 2), heading: (0, 1), action: None, reward: 0.963918300064
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 16, 't': 9, 'action': None, 'reward': 0.9639183000644078, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.96)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (2, 2), heading: (0, 1), action: None, reward: 2.12002272763
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 15, 't': 10, 'action': None, 'reward': 2.120022727625814, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.12)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (2, 3), heading: (0, 1), action: forward, reward: 2.64621775449
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 14, 't': 11, 'action': 'forward', 'reward': 2.646217754485524, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.65)
52% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 324
\-------------------------
Environment.reset(): Trial set up with start = (2, 3), destination = (5, 2), deadline = 20
Simulating trial. . .
epsilon = 0.0396; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: right, reward: 2.89527567619
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 2.8952756761888123, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.90)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (4, 3), heading: (1, 0), action: forward, reward: 2.06294730683
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 19, 't': 1, 'action': 'forward', 'reward': 2.062947306832979, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.06)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (4, 4), heading: (0, 1), action: right, reward: 1.6688070888
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', 'left'), 'deadline': 18, 't': 2, 'action': 'right', 'reward': 1.6688070887988138, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', 'left')
Agent drove right instead of forward. (rewarded 1.67)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (4, 4), heading: (0, 1), action: None, reward: 2.31149934973
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 2.311499349733414, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.31)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (4, 4), heading: (0, 1), action: None, reward: 2.35901639385
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 16, 't': 4, 'action': None, 'reward': 2.3590163938489632, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.36)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (4, 4), heading: (0, 1), action: None, reward: 1.65380712327
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', 'right'), 'deadline': 15, 't': 5, 'action': None, 'reward': 1.6538071232716738, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'right')
Agent properly idled at a red light. (rewarded 1.65)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 4), heading: (-1, 0), action: right, reward: 1.65292953791
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 14, 't': 6, 'action': 'right', 'reward': 1.65292953790978, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 1.65)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 3), heading: (0, -1), action: right, reward: 2.60331475879
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', 'left'), 'deadline': 13, 't': 7, 'action': 'right', 'reward': 2.603314758787419, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', 'left')
Agent followed the waypoint right. (rewarded 2.60)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (4, 3), heading: (1, 0), action: right, reward: 1.56472251301
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 12, 't': 8, 'action': 'right', 'reward': 1.5647225130086693, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.56)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (5, 3), heading: (1, 0), action: forward, reward: 1.73101328034
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 11, 't': 9, 'action': 'forward', 'reward': 1.7310132803434248, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.73)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 2), heading: (0, -1), action: left, reward: 0.805593726562
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 10, 't': 10, 'action': 'left', 'reward': 0.8055937265621318, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 0.81)
45% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 325
\-------------------------
Environment.reset(): Trial set up with start = (1, 7), destination = (6, 2), deadline = 20
Simulating trial. . .
epsilon = 0.0392; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: right, reward: 2.94675571013
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 2.946755710130012, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.95)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: forward, reward: 1.72063458653
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 19, 't': 1, 'action': 'forward', 'reward': 1.7206345865278312, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.72)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: None, reward: 2.71149230491
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.711492304909417, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.71)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: None, reward: 2.32269416756
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 2.322694167559609, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.32)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 2), heading: (0, 1), action: left, reward: 1.08041191112
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'right'), 'deadline': 16, 't': 4, 'action': 'left', 'reward': 1.080411911119585, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'right')
Agent drove left instead of forward. (rewarded 1.08)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 2), heading: (-1, 0), action: right, reward: 2.2483887658
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 15, 't': 5, 'action': 'right', 'reward': 2.248388765799004, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.25)
70% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 326
\-------------------------
Environment.reset(): Trial set up with start = (3, 3), destination = (5, 5), deadline = 20
Simulating trial. . .
epsilon = 0.0388; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 3), heading: (-1, 0), action: right, reward: 1.56217416869
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', 'forward'), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.5621741686860005, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', 'forward')
Agent drove right instead of left. (rewarded 1.56)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 2), heading: (0, -1), action: right, reward: 0.760588154242
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', 'forward'), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 0.760588154242145, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', 'forward')
Agent drove right instead of left. (rewarded 0.76)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 2), heading: (1, 0), action: right, reward: 2.4294200521
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 18, 't': 2, 'action': 'right', 'reward': 2.429420052102241, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent followed the waypoint right. (rewarded 2.43)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (4, 2), heading: (1, 0), action: forward, reward: 2.78899457241
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 2.7889945724062324, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 2.79)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 2), heading: (1, 0), action: forward, reward: 2.08920208602
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 2.089202086018739, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.09)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 2), heading: (1, 0), action: None, reward: 1.89673725889
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 15, 't': 5, 'action': None, 'reward': 1.896737258889682, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 1.90)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (5, 2), heading: (1, 0), action: None, reward: 1.0966553447
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', 'forward'), 'deadline': 14, 't': 6, 'action': None, 'reward': 1.0966553447025134, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 1.10)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: right, reward: 0.519605277792
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', 'left'), 'deadline': 13, 't': 7, 'action': 'right', 'reward': 0.5196052777920317, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', 'left')
Agent drove right instead of left. (rewarded 0.52)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (5, 4), heading: (0, 1), action: forward, reward: 1.03228996138
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 12, 't': 8, 'action': 'forward', 'reward': 1.0322899613782293, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent followed the waypoint forward. (rewarded 1.03)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (5, 4), heading: (0, 1), action: None, reward: 2.64388720982
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 11, 't': 9, 'action': None, 'reward': 2.6438872098182244, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.64)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (5, 4), heading: (0, 1), action: None, reward: 1.99602197038
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 10, 't': 10, 'action': None, 'reward': 1.996021970384171, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.00)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 5), heading: (0, 1), action: forward, reward: 2.54392165568
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 9, 't': 11, 'action': 'forward', 'reward': 2.5439216556837323, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.54)
40% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 327
\-------------------------
Environment.reset(): Trial set up with start = (8, 3), destination = (3, 4), deadline = 20
Simulating trial. . .
epsilon = 0.0384; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: forward, reward: 1.18060117423
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 20, 't': 0, 'action': 'forward', 'reward': 1.1806011742279026, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent followed the waypoint forward. (rewarded 1.18)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: forward, reward: 2.47119426935
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 19, 't': 1, 'action': 'forward', 'reward': 2.4711942693493096, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.47)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: forward, reward: 2.89792252727
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'left'), 'deadline': 18, 't': 2, 'action': 'forward', 'reward': 2.8979225272746314, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'left')
Agent followed the waypoint forward. (rewarded 2.90)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 4), heading: (0, 1), action: right, reward: 2.51894336567
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 17, 't': 3, 'action': 'right', 'reward': 2.5189433656702747, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.52)
80% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 328
\-------------------------
Environment.reset(): Trial set up with start = (8, 3), destination = (4, 3), deadline = 20
Simulating trial. . .
epsilon = 0.0380; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: forward, reward: 2.13455016905
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 20, 't': 0, 'action': 'forward', 'reward': 2.1345501690513142, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.13)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: None, reward: 1.42715939559
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 19, 't': 1, 'action': None, 'reward': 1.4271593955887019, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.43)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: None, reward: 1.89725123235
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.897251232347039, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.90)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: forward, reward: 2.12993061582
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 2.1299306158206073, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.13)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: forward, reward: 2.40060743979
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 2.4006074397944577, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.40)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: None, reward: 2.20021721208
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 15, 't': 5, 'action': None, 'reward': 2.200217212078701, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.20)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: None, reward: 2.33403956068
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 14, 't': 6, 'action': None, 'reward': 2.334039560683916, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.33)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 3), heading: (1, 0), action: forward, reward: 2.08493368534
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 2.0849336853369094, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.08)
60% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 329
\-------------------------
Environment.reset(): Trial set up with start = (5, 2), destination = (2, 3), deadline = 20
Simulating trial. . .
epsilon = 0.0376; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (4, 2), heading: (-1, 0), action: forward, reward: 1.95387699874
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 20, 't': 0, 'action': 'forward', 'reward': 1.9538769987443425, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.95)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 2), heading: (-1, 0), action: forward, reward: 1.99629021618
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 19, 't': 1, 'action': 'forward', 'reward': 1.9962902161773768, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 2.00)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 2), heading: (-1, 0), action: None, reward: 2.93051378321
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.9305137832053187, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.93)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 2), heading: (-1, 0), action: None, reward: 1.94033758563
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 17, 't': 3, 'action': None, 'reward': 1.9403375856311031, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.94)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 2), heading: (-1, 0), action: None, reward: 2.14739154152
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 16, 't': 4, 'action': None, 'reward': 2.147391541521081, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.15)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: forward, reward: 1.47976884121
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 1.479768841214242, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.48)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: None, reward: 2.15696479458
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', 'right'), 'deadline': 14, 't': 6, 'action': None, 'reward': 2.156964794577219, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'right')
Agent properly idled at a red light. (rewarded 2.16)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 7), heading: (0, -1), action: right, reward: 0.274662034247
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', 'right'), 'deadline': 13, 't': 7, 'action': 'right', 'reward': 0.2746620342465309, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', 'right')
Agent drove right instead of left. (rewarded 0.27)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 7), heading: (1, 0), action: right, reward: 2.76803426892
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 12, 't': 8, 'action': 'right', 'reward': 2.7680342689171757, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.77)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (3, 2), heading: (0, 1), action: right, reward: 2.20237222294
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 11, 't': 9, 'action': 'right', 'reward': 2.2023722229445655, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.20)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: right, reward: 2.29905197021
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 10, 't': 10, 'action': 'right', 'reward': 2.2990519702076653, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 2.30)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (2, 3), heading: (0, 1), action: left, reward: 1.24438867612
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 9, 't': 11, 'action': 'left', 'reward': 1.2443886761247867, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.24)
40% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 330
\-------------------------
Environment.reset(): Trial set up with start = (4, 2), destination = (1, 3), deadline = 20
Simulating trial. . .
epsilon = 0.0373; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (4, 2), heading: (-1, 0), action: None, reward: 2.10830327569
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 20, 't': 0, 'action': None, 'reward': 2.1083032756869002, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 2.11)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (4, 2), heading: (-1, 0), action: None, reward: 2.47278272552
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 19, 't': 1, 'action': None, 'reward': 2.472782725524561, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.47)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (4, 2), heading: (-1, 0), action: None, reward: 2.6637488889
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.6637488889008036, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.66)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 2), heading: (-1, 0), action: forward, reward: 0.986170128659
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 0.9861701286586013, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 0.99)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 2), heading: (-1, 0), action: None, reward: 2.86309882342
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 16, 't': 4, 'action': None, 'reward': 2.8630988234229626, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.86)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: forward, reward: 1.54537735509
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'left'), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 1.545377355091473, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'left')
Agent followed the waypoint forward. (rewarded 1.55)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 2), heading: (-1, 0), action: forward, reward: 2.18104040783
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 14, 't': 6, 'action': 'forward', 'reward': 2.181040407833625, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.18)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 2), heading: (-1, 0), action: None, reward: 1.62221152593
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 13, 't': 7, 'action': None, 'reward': 1.62221152592766, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.62)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (1, 2), heading: (-1, 0), action: None, reward: 1.91914958841
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 12, 't': 8, 'action': None, 'reward': 1.9191495884144163, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.92)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 3), heading: (0, 1), action: left, reward: 2.64275013319
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 11, 't': 9, 'action': 'left', 'reward': 2.6427501331883962, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 2.64)
50% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 331
\-------------------------
Environment.reset(): Trial set up with start = (2, 7), destination = (6, 5), deadline = 30
Simulating trial. . .
epsilon = 0.0369; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 7), heading: (-1, 0), action: forward, reward: 2.59743630591
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 30, 't': 0, 'action': 'forward', 'reward': 2.597436305912124, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent followed the waypoint forward. (rewarded 2.60)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 7), heading: (-1, 0), action: None, reward: 2.87060201224
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 29, 't': 1, 'action': None, 'reward': 2.8706020122371325, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.87)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 7), heading: (-1, 0), action: forward, reward: -9.29139134089
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, 'right'), 'deadline': 28, 't': 2, 'action': 'forward', 'reward': -9.291391340889046, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'right')
Agent attempted driving forward through a red light. (rewarded -9.29)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: forward, reward: 2.21898602318
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 27, 't': 3, 'action': 'forward', 'reward': 2.2189860231846352, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.22)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: forward, reward: 1.67418836002
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 26, 't': 4, 'action': 'forward', 'reward': 1.6741883600228162, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.67)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: None, reward: 2.4219663704
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 25, 't': 5, 'action': None, 'reward': 2.4219663704036765, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 2.42)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: None, reward: 2.48752235921
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 24, 't': 6, 'action': None, 'reward': 2.4875223592055917, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.49)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: forward, reward: 2.23770247816
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'right'), 'deadline': 23, 't': 7, 'action': 'forward', 'reward': 2.2377024781644046, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'right')
Agent followed the waypoint forward. (rewarded 2.24)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (6, 6), heading: (0, -1), action: right, reward: 1.28702557093
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 22, 't': 8, 'action': 'right', 'reward': 1.2870255709275569, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.29)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (6, 6), heading: (0, -1), action: None, reward: 2.14531127719
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 21, 't': 9, 'action': None, 'reward': 2.1453112771864316, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.15)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 5), heading: (0, -1), action: forward, reward: 1.49974601639
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 20, 't': 10, 'action': 'forward', 'reward': 1.4997460163855727, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.50)
63% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 332
\-------------------------
Environment.reset(): Trial set up with start = (2, 5), destination = (4, 3), deadline = 20
Simulating trial. . .
epsilon = 0.0365; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: right, reward: 1.19781153228
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'right', 'left'), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.1978115322776823, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'right', 'left')
Agent followed the waypoint right. (rewarded 1.20)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: forward, reward: 2.48888425839
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 19, 't': 1, 'action': 'forward', 'reward': 2.4888842583864124, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.49)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (4, 4), heading: (0, -1), action: left, reward: 2.32429876476
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 18, 't': 2, 'action': 'left', 'reward': 2.3242987647595745, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.32)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 3), heading: (0, -1), action: forward, reward: 2.69947296775
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 2.6994729677512694, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.70)
80% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 333
\-------------------------
Environment.reset(): Trial set up with start = (5, 6), destination = (3, 4), deadline = 20
Simulating trial. . .
epsilon = 0.0362; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 6), heading: (0, -1), action: None, reward: 1.21353980916
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'left'), 'deadline': 20, 't': 0, 'action': None, 'reward': 1.2135398091602885, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 1.21)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 6), heading: (0, -1), action: None, reward: 0.994699390805
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 19, 't': 1, 'action': None, 'reward': 0.994699390804602, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 0.99)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 6), heading: (0, -1), action: None, reward: 2.11888032817
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.1188803281680677, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.12)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 6), heading: (0, -1), action: None, reward: 2.34449898861
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 2.344498988613402, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.34)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 6), heading: (0, -1), action: None, reward: 2.84912033737
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 16, 't': 4, 'action': None, 'reward': 2.8491203373662586, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.85)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 6), heading: (0, -1), action: None, reward: 1.52630883495
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'right'), 'deadline': 15, 't': 5, 'action': None, 'reward': 1.5263088349454828, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'right')
Agent properly idled at a red light. (rewarded 1.53)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 6), heading: (-1, 0), action: left, reward: 2.32855605783
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 14, 't': 6, 'action': 'left', 'reward': 2.328556057831035, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.33)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 6), heading: (-1, 0), action: forward, reward: 2.76463364862
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 2.7646336486195677, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.76)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 5), heading: (0, -1), action: right, reward: 2.37126534351
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 12, 't': 8, 'action': 'right', 'reward': 2.371265343509906, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.37)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 4), heading: (0, -1), action: forward, reward: 0.906569991341
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 11, 't': 9, 'action': 'forward', 'reward': 0.9065699913414447, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 0.91)
50% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 334
\-------------------------
Environment.reset(): Trial set up with start = (5, 5), destination = (7, 2), deadline = 25
Simulating trial. . .
epsilon = 0.0358; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 5), heading: (1, 0), action: forward, reward: 2.70219588581
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 25, 't': 0, 'action': 'forward', 'reward': 2.7021958858122854, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.70)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 5), heading: (1, 0), action: None, reward: 2.47633361379
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 24, 't': 1, 'action': None, 'reward': 2.476333613791141, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.48)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 5), heading: (1, 0), action: None, reward: 2.69765878617
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 23, 't': 2, 'action': None, 'reward': 2.697658786168173, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.70)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 5), heading: (1, 0), action: forward, reward: 2.49425777984
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': 2.49425777984397, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.49)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 6), heading: (0, 1), action: right, reward: 1.79189883297
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 21, 't': 4, 'action': 'right', 'reward': 1.791898832972281, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.79)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 7), heading: (0, 1), action: forward, reward: 1.8157211819
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 20, 't': 5, 'action': 'forward', 'reward': 1.8157211818956087, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.82)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (7, 7), heading: (0, 1), action: None, reward: 2.53352496714
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'left'), 'deadline': 19, 't': 6, 'action': None, 'reward': 2.5335249671376237, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'left')
Agent properly idled at a red light. (rewarded 2.53)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (7, 2), heading: (0, 1), action: forward, reward: 1.5164436738
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 1.516443673796964, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.52)
68% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 335
\-------------------------
Environment.reset(): Trial set up with start = (2, 2), destination = (5, 4), deadline = 25
Simulating trial. . .
epsilon = 0.0354; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 2), heading: (1, 0), action: None, reward: 1.91206330262
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 25, 't': 0, 'action': None, 'reward': 1.9120633026201286, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.91)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 2), heading: (1, 0), action: None, reward: 1.71912342788
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 24, 't': 1, 'action': None, 'reward': 1.7191234278775311, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.72)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 2), heading: (1, 0), action: None, reward: 2.59761844887
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 23, 't': 2, 'action': None, 'reward': 2.597618448869432, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.60)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 2), heading: (1, 0), action: forward, reward: 2.06584140503
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': 2.0658414050274208, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.07)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (4, 2), heading: (1, 0), action: forward, reward: 1.42187301328
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 1.4218730132761828, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.42)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 2), heading: (1, 0), action: forward, reward: 1.23023668116
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'right'), 'deadline': 20, 't': 5, 'action': 'forward', 'reward': 1.2302366811581429, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'right')
Agent followed the waypoint forward. (rewarded 1.23)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: right, reward: 2.03511835955
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 19, 't': 6, 'action': 'right', 'reward': 2.035118359547416, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 2.04)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 4), heading: (0, 1), action: forward, reward: 2.78883530989
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 2.7888353098905965, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.79)
68% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 336
\-------------------------
Environment.reset(): Trial set up with start = (1, 3), destination = (4, 4), deadline = 20
Simulating trial. . .
epsilon = 0.0351; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 4), heading: (0, 1), action: forward, reward: 1.63369692828
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', None), 'deadline': 20, 't': 0, 'action': 'forward', 'reward': 1.6336969282807245, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', None)
Agent drove forward instead of left. (rewarded 1.63)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 4), heading: (0, 1), action: None, reward: 1.14476642212
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 19, 't': 1, 'action': None, 'reward': 1.144766422120243, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.14)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 4), heading: (0, 1), action: None, reward: 1.13740941416
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'right'), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.137409414158158, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'right')
Agent properly idled at a red light. (rewarded 1.14)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 4), heading: (1, 0), action: left, reward: 0.9879816727
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 17, 't': 3, 'action': 'left', 'reward': 0.9879816726998363, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 0.99)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: forward, reward: 2.45828864562
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 2.458288645615785, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.46)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: None, reward: 0.95065962805
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 15, 't': 5, 'action': None, 'reward': 0.9506596280498683, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 0.95)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: None, reward: 1.57335811569
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 14, 't': 6, 'action': None, 'reward': 1.5733581156861909, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.57)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: None, reward: 2.61469403548
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 13, 't': 7, 'action': None, 'reward': 2.614694035475151, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.61)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 5), heading: (0, 1), action: right, reward: 0.918023779694
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', 'right'), 'deadline': 12, 't': 8, 'action': 'right', 'reward': 0.9180237796941101, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', 'right')
Agent drove right instead of forward. (rewarded 0.92)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (3, 5), heading: (0, 1), action: None, reward: 0.961557061224
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 11, 't': 9, 'action': None, 'reward': 0.9615570612239728, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 0.96)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: left, reward: 2.0374610115
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'forward'), 'deadline': 10, 't': 10, 'action': 'left', 'reward': 2.0374610114983884, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'forward')
Agent followed the waypoint left. (rewarded 2.04)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 4), heading: (0, -1), action: left, reward: 1.06804107373
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 9, 't': 11, 'action': 'left', 'reward': 1.0680410737337764, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 1.07)
40% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 337
\-------------------------
Environment.reset(): Trial set up with start = (2, 5), destination = (6, 4), deadline = 25
Simulating trial. . .
epsilon = 0.0347; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: forward, reward: 2.22548654055
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 25, 't': 0, 'action': 'forward', 'reward': 2.2254865405465134, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.23)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: None, reward: 2.71209739166
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 24, 't': 1, 'action': None, 'reward': 2.71209739166019, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.71)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: None, reward: 2.60573608829
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 23, 't': 2, 'action': None, 'reward': 2.6057360882918896, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.61)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: forward, reward: 1.30171621938
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': 1.3017162193820224, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.30)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 5), heading: (-1, 0), action: forward, reward: 1.59352606933
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 1.5935260693343116, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.59)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 5), heading: (-1, 0), action: None, reward: 1.35640090779
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 20, 't': 5, 'action': None, 'reward': 1.3564009077887438, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.36)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (6, 5), heading: (-1, 0), action: forward, reward: 1.59721010687
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'forward'), 'deadline': 19, 't': 6, 'action': 'forward', 'reward': 1.5972101068657298, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'forward')
Agent followed the waypoint forward. (rewarded 1.60)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 4), heading: (0, -1), action: right, reward: 1.96915909591
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 18, 't': 7, 'action': 'right', 'reward': 1.9691590959127578, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 1.97)
68% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 338
\-------------------------
Environment.reset(): Trial set up with start = (5, 6), destination = (1, 7), deadline = 25
Simulating trial. . .
epsilon = 0.0344; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 6), heading: (0, 1), action: None, reward: 1.83996173901
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', 'forward'), 'deadline': 25, 't': 0, 'action': None, 'reward': 1.83996173901298, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 1.84)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 6), heading: (0, 1), action: None, reward: 1.78980105118
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 24, 't': 1, 'action': None, 'reward': 1.7898010511820297, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.79)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 6), heading: (0, 1), action: None, reward: 1.21191756576
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.2119175657634933, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.21)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 6), heading: (0, 1), action: None, reward: 1.6999331615
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 22, 't': 3, 'action': None, 'reward': 1.6999331614955842, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.70)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 6), heading: (0, 1), action: None, reward: 1.1147180402
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 21, 't': 4, 'action': None, 'reward': 1.1147180402016261, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.11)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 6), heading: (0, 1), action: None, reward: 1.13025022012
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 20, 't': 5, 'action': None, 'reward': 1.130250220117858, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.13)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (6, 6), heading: (1, 0), action: left, reward: 0.937757922177
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 19, 't': 6, 'action': 'left', 'reward': 0.9377579221771009, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 0.94)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (7, 6), heading: (1, 0), action: forward, reward: 1.7776865854
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 1.7776865853978485, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.78)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (7, 6), heading: (1, 0), action: None, reward: 2.47133569162
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 17, 't': 8, 'action': None, 'reward': 2.4713356916194487, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.47)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (7, 6), heading: (1, 0), action: None, reward: 1.0020221158
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 16, 't': 9, 'action': None, 'reward': 1.002022115796526, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 1.00)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (7, 6), heading: (1, 0), action: None, reward: 1.5453048137
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 15, 't': 10, 'action': None, 'reward': 1.5453048137012564, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.55)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: forward, reward: 1.50272847829
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 14, 't': 11, 'action': 'forward', 'reward': 1.5027284782880341, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.50)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: forward, reward: 2.40600050475
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'left'), 'deadline': 13, 't': 12, 'action': 'forward', 'reward': 2.406000504746257, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'left')
Agent followed the waypoint forward. (rewarded 2.41)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: right, reward: 2.04051967861
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 12, 't': 13, 'action': 'right', 'reward': 2.040519678608344, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent followed the waypoint right. (rewarded 2.04)
44% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 339
\-------------------------
Environment.reset(): Trial set up with start = (3, 5), destination = (8, 3), deadline = 25
Simulating trial. . .
epsilon = 0.0340; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: None, reward: 2.1732313036
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 25, 't': 0, 'action': None, 'reward': 2.1732313036038704, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.17)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: None, reward: 2.06329745728
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 24, 't': 1, 'action': None, 'reward': 2.0632974572849303, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.06)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: None, reward: 2.14095418294
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 23, 't': 2, 'action': None, 'reward': 2.14095418294448, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.14)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: None, reward: 2.65247485464
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 22, 't': 3, 'action': None, 'reward': 2.652474854637213, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.65)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 6), heading: (0, 1), action: right, reward: 0.969090287125
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 21, 't': 4, 'action': 'right', 'reward': 0.9690902871252344, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 0.97)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: right, reward: 1.73540117699
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 20, 't': 5, 'action': 'right', 'reward': 1.7354011769922022, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 1.74)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: None, reward: 1.69584288453
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 19, 't': 6, 'action': None, 'reward': 1.6958428845310927, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 1.70)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: None, reward: 1.9091779002
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 18, 't': 7, 'action': None, 'reward': 1.9091779001965807, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.91)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: None, reward: 2.14754190286
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'right'), 'deadline': 17, 't': 8, 'action': None, 'reward': 2.147541902860992, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'right')
Agent properly idled at a red light. (rewarded 2.15)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: forward, reward: 2.49429089915
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 16, 't': 9, 'action': 'forward', 'reward': 2.494290899146887, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.49)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: None, reward: 1.87879038877
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 15, 't': 10, 'action': None, 'reward': 1.8787903887709505, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.88)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: None, reward: 1.57701307816
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 14, 't': 11, 'action': None, 'reward': 1.5770130781630245, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.58)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: None, reward: 2.57339410455
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 13, 't': 12, 'action': None, 'reward': 2.5733941045546187, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.57)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: left, reward: -0.126655237229
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'right'), 'deadline': 12, 't': 13, 'action': 'left', 'reward': -0.12665523722906902, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'right')
Agent drove left instead of forward. (rewarded -0.13)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: right, reward: 2.09245548016
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 11, 't': 14, 'action': 'right', 'reward': 2.0924554801570077, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent followed the waypoint right. (rewarded 2.09)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: None, reward: 0.839042479113
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 10, 't': 15, 'action': None, 'reward': 0.8390424791130853, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 0.84)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (8, 2), heading: (0, 1), action: left, reward: 2.14587096284
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 9, 't': 16, 'action': 'left', 'reward': 2.1458709628402985, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.15)
32% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (8, 3), heading: (0, 1), action: forward, reward: 1.81343356226
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 8, 't': 17, 'action': 'forward', 'reward': 1.8134335622563653, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.81)
28% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 340
\-------------------------
Environment.reset(): Trial set up with start = (4, 2), destination = (8, 7), deadline = 25
Simulating trial. . .
epsilon = 0.0337; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (4, 2), heading: (-1, 0), action: None, reward: 1.92992961335
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 25, 't': 0, 'action': None, 'reward': 1.9299296133463655, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.93)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (4, 2), heading: (-1, 0), action: None, reward: 2.29653471475
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 24, 't': 1, 'action': None, 'reward': 2.296534714748656, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 2.30)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (4, 2), heading: (-1, 0), action: None, reward: 2.71843653225
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 23, 't': 2, 'action': None, 'reward': 2.71843653224545, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.72)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 2), heading: (-1, 0), action: forward, reward: 1.05991179937
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': 1.059911799368384, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.06)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 2), heading: (-1, 0), action: None, reward: 2.72095761126
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 21, 't': 4, 'action': None, 'reward': 2.72095761126172, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.72)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: forward, reward: 1.26898973574
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 20, 't': 5, 'action': 'forward', 'reward': 1.2689897357431243, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.27)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 2), heading: (-1, 0), action: forward, reward: 1.60481223113
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 19, 't': 6, 'action': 'forward', 'reward': 1.6048122311322992, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.60)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 2), heading: (-1, 0), action: None, reward: 2.230760277
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 18, 't': 7, 'action': None, 'reward': 2.2307602769992174, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.23)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (1, 2), heading: (-1, 0), action: None, reward: 2.74176245911
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 17, 't': 8, 'action': None, 'reward': 2.741762459107944, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.74)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: forward, reward: 2.81178512757
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 16, 't': 9, 'action': 'forward', 'reward': 2.811785127574537, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.81)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (8, 7), heading: (0, -1), action: right, reward: 1.60991703032
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 15, 't': 10, 'action': 'right', 'reward': 1.609917030319155, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.61)
56% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 341
\-------------------------
Environment.reset(): Trial set up with start = (8, 4), destination = (6, 2), deadline = 20
Simulating trial. . .
epsilon = 0.0334; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 4), heading: (0, 1), action: None, reward: 1.64260346807
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', 'forward'), 'deadline': 20, 't': 0, 'action': None, 'reward': 1.6426034680740882, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 1.64)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (7, 4), heading: (-1, 0), action: right, reward: 1.28551347002
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 1.2855134700226558, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 1.29)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 4), heading: (-1, 0), action: None, reward: 1.11262317014
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.112623170140726, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.11)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 4), heading: (-1, 0), action: None, reward: 1.05231872662
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 17, 't': 3, 'action': None, 'reward': 1.0523187266218634, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.05)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (6, 4), heading: (-1, 0), action: forward, reward: 2.82238166426
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 2.8223816642572235, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent followed the waypoint forward. (rewarded 2.82)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (6, 4), heading: (-1, 0), action: None, reward: 1.04224007438
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'forward'), 'deadline': 15, 't': 5, 'action': None, 'reward': 1.0422400743810234, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.04)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (6, 3), heading: (0, -1), action: right, reward: 2.76431629672
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 14, 't': 6, 'action': 'right', 'reward': 2.764316296724285, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.76)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 2), heading: (0, -1), action: forward, reward: 1.01312068024
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 1.0131206802404567, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.01)
60% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 342
\-------------------------
Environment.reset(): Trial set up with start = (2, 2), destination = (5, 6), deadline = 25
Simulating trial. . .
epsilon = 0.0330; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 2), heading: (0, 1), action: None, reward: 1.10971361038
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 25, 't': 0, 'action': None, 'reward': 1.1097136103842142, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.11)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 2), heading: (0, 1), action: None, reward: 1.8230894179
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 24, 't': 1, 'action': None, 'reward': 1.8230894179016792, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 1.82)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 2), heading: (0, 1), action: None, reward: 1.7893818001
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.7893818001012305, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.79)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 2), heading: (1, 0), action: left, reward: 2.30633909009
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 22, 't': 3, 'action': 'left', 'reward': 2.306339090089387, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.31)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (4, 2), heading: (1, 0), action: forward, reward: 2.51589272151
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 2.515892721505093, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.52)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (4, 2), heading: (1, 0), action: None, reward: 1.36392839974
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 20, 't': 5, 'action': None, 'reward': 1.3639283997414098, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 1.36)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 2), heading: (1, 0), action: None, reward: 2.36883222486
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 19, 't': 6, 'action': None, 'reward': 2.368832224864022, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.37)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (5, 2), heading: (1, 0), action: forward, reward: 2.82209283534
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 2.82209283533608, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.82)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (5, 7), heading: (0, -1), action: left, reward: 2.29602147938
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 17, 't': 8, 'action': 'left', 'reward': 2.296021479381464, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.30)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 6), heading: (0, -1), action: forward, reward: 2.50803043292
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 16, 't': 9, 'action': 'forward', 'reward': 2.5080304329230216, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.51)
60% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 343
\-------------------------
Environment.reset(): Trial set up with start = (8, 4), destination = (5, 5), deadline = 20
Simulating trial. . .
epsilon = 0.0327; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 4), heading: (1, 0), action: right, reward: 1.34066932278
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'left'), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.3406693227754314, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'left')
Agent drove right instead of left. (rewarded 1.34)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 5), heading: (0, 1), action: right, reward: 1.47605279684
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 1.4760527968425305, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.48)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: right, reward: 1.22418391921
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': 'right', 'reward': 1.2241839192060244, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 1.22)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: None, reward: 1.76801611682
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 1.768016116821567, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.77)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 4), heading: (0, -1), action: right, reward: 1.2447438626
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 16, 't': 4, 'action': 'right', 'reward': 1.2447438625983764, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent drove right instead of forward. (rewarded 1.24)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 4), heading: (0, -1), action: None, reward: 1.69217530081
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'left'), 'deadline': 15, 't': 5, 'action': None, 'reward': 1.6921753008056135, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 1.69)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (8, 4), heading: (0, -1), action: None, reward: 2.8352040413
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 14, 't': 6, 'action': None, 'reward': 2.835204041300017, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.84)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (7, 4), heading: (-1, 0), action: left, reward: 2.07472122959
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 13, 't': 7, 'action': 'left', 'reward': 2.0747212295920647, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.07)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (6, 4), heading: (-1, 0), action: forward, reward: 2.63905586464
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 12, 't': 8, 'action': 'forward', 'reward': 2.639055864642594, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.64)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (6, 3), heading: (0, -1), action: right, reward: 0.447849945347
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', 'right'), 'deadline': 11, 't': 9, 'action': 'right', 'reward': 0.4478499453472726, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', 'right')
Agent drove right instead of forward. (rewarded 0.45)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (5, 3), heading: (-1, 0), action: left, reward: 2.68389413818
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 10, 't': 10, 'action': 'left', 'reward': 2.6838941381827253, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.68)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (5, 3), heading: (-1, 0), action: None, reward: 2.44287611882
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 9, 't': 11, 'action': None, 'reward': 2.4428761188242674, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.44)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (5, 3), heading: (-1, 0), action: None, reward: 1.76936636337
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 8, 't': 12, 'action': None, 'reward': 1.769366363371096, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.77)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (5, 3), heading: (-1, 0), action: None, reward: 2.39795754232
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 7, 't': 13, 'action': None, 'reward': 2.39795754232441, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.40)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (5, 3), heading: (-1, 0), action: None, reward: 1.20831519631
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 6, 't': 14, 'action': None, 'reward': 1.2083151963054009, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.21)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (5, 3), heading: (-1, 0), action: forward, reward: -9.46367622816
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 5, 't': 15, 'action': 'forward', 'reward': -9.463676228159379, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent attempted driving forward through a red light. (rewarded -9.46)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (5, 4), heading: (0, 1), action: left, reward: 2.19357995607
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 4, 't': 16, 'action': 'left', 'reward': 2.193579956066387, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.19)
15% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 5), heading: (0, 1), action: forward, reward: 1.2214115562
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 3, 't': 17, 'action': 'forward', 'reward': 1.2214115562007175, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.22)
10% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 344
\-------------------------
Environment.reset(): Trial set up with start = (1, 4), destination = (4, 3), deadline = 20
Simulating trial. . .
epsilon = 0.0324; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 4), heading: (0, 1), action: None, reward: 2.11850023549
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', 'left'), 'deadline': 20, 't': 0, 'action': None, 'reward': 2.1185002354945466, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', 'left')
Agent properly idled at a red light. (rewarded 2.12)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 4), heading: (-1, 0), action: right, reward: 1.73858137271
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', 'right'), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 1.7385813727097172, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', 'right')
Agent drove right instead of left. (rewarded 1.74)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 3), heading: (0, -1), action: right, reward: 2.55938874698
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 18, 't': 2, 'action': 'right', 'reward': 2.5593887469848644, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent followed the waypoint right. (rewarded 2.56)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 3), heading: (0, -1), action: None, reward: 0.0753048531768
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', 'forward'), 'deadline': 17, 't': 3, 'action': None, 'reward': 0.07530485317684377, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 0.08)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 3), heading: (0, -1), action: left, reward: -9.59169217346
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 16, 't': 4, 'action': 'left', 'reward': -9.591692173456694, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent attempted driving left through a red light. (rewarded -9.59)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: right, reward: 1.29487301118
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 15, 't': 5, 'action': 'right', 'reward': 1.2948730111812914, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent followed the waypoint right. (rewarded 1.29)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: None, reward: 2.71339979653
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 14, 't': 6, 'action': None, 'reward': 2.71339979653004, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.71)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: forward, reward: 0.963009333104
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 0.9630093331043508, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 0.96)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 1.24297711212
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 12, 't': 8, 'action': None, 'reward': 1.2429771121171393, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.24)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 2.02346594236
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 11, 't': 9, 'action': None, 'reward': 2.0234659423561348, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.02)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 2.65958623436
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 10, 't': 10, 'action': None, 'reward': 2.659586234361902, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.66)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: forward, reward: 1.27782167993
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 9, 't': 11, 'action': 'forward', 'reward': 1.277821679926846, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.28)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 3), heading: (1, 0), action: forward, reward: 1.66621660251
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 8, 't': 12, 'action': 'forward', 'reward': 1.6662166025114098, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.67)
35% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 345
\-------------------------
Environment.reset(): Trial set up with start = (1, 4), destination = (4, 2), deadline = 25
Simulating trial. . .
epsilon = 0.0321; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 4), heading: (0, 1), action: None, reward: 1.92087034216
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 25, 't': 0, 'action': None, 'reward': 1.9208703421570148, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.92)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 4), heading: (0, 1), action: None, reward: 1.66449249832
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 24, 't': 1, 'action': None, 'reward': 1.664492498317182, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.66)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 4), heading: (0, 1), action: None, reward: 1.88216518748
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.8821651874837138, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.88)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 4), heading: (1, 0), action: left, reward: 2.06705944861
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 22, 't': 3, 'action': 'left', 'reward': 2.067059448612151, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.07)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 4), heading: (1, 0), action: None, reward: 2.71400723542
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 21, 't': 4, 'action': None, 'reward': 2.7140072354153513, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.71)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 4), heading: (1, 0), action: None, reward: 1.27460014878
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 20, 't': 5, 'action': None, 'reward': 1.2746001487829737, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.27)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 4), heading: (1, 0), action: None, reward: 1.0865076116
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 19, 't': 6, 'action': None, 'reward': 1.0865076115955514, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 1.09)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 5), heading: (0, 1), action: right, reward: 0.244211332795
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', 'forward'), 'deadline': 18, 't': 7, 'action': 'right', 'reward': 0.24421133279538343, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', 'forward')
Agent drove right instead of forward. (rewarded 0.24)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 5), heading: (0, 1), action: None, reward: 1.09131135285
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 17, 't': 8, 'action': None, 'reward': 1.0913113528507494, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.09)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 5), heading: (0, 1), action: None, reward: 1.64438556981
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 16, 't': 9, 'action': None, 'reward': 1.6443855698092746, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.64)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: left, reward: 1.21617871372
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 15, 't': 10, 'action': 'left', 'reward': 1.2161787137204194, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.22)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: forward, reward: 1.21335057318
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 14, 't': 11, 'action': 'forward', 'reward': 1.2133505731822436, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.21)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: left, reward: -19.8063773373
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'forward', 'left': 'right'}, 'violation': 3, 'light': 'green', 'state': ('right', 'green', 'right', 'right'), 'deadline': 13, 't': 12, 'action': 'left', 'reward': -19.806377337277002, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'right', 'right')
Agent attempted driving left through traffic and cause a minor accident. (rewarded -19.81)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (4, 6), heading: (0, 1), action: right, reward: 2.27496513692
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 12, 't': 13, 'action': 'right', 'reward': 2.2749651369154993, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 2.27)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (4, 6), heading: (0, 1), action: None, reward: 2.60259565138
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 11, 't': 14, 'action': None, 'reward': 2.6025956513816393, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.60)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (4, 6), heading: (0, 1), action: None, reward: 1.14670471574
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 10, 't': 15, 'action': None, 'reward': 1.1467047157448758, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 1.15)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (4, 7), heading: (0, 1), action: forward, reward: 1.73903459503
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 9, 't': 16, 'action': 'forward', 'reward': 1.739034595026781, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.74)
32% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (4, 7), heading: (0, 1), action: None, reward: 2.19652645186
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 8, 't': 17, 'action': None, 'reward': 2.1965264518561347, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.20)
28% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (4, 7), heading: (0, 1), action: None, reward: 0.546356309903
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'forward'), 'deadline': 7, 't': 18, 'action': None, 'reward': 0.5463563099027546, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 0.55)
24% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 2), heading: (0, 1), action: forward, reward: 0.694149171424
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 6, 't': 19, 'action': 'forward', 'reward': 0.6941491714238457, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 0.69)
20% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 346
\-------------------------
Environment.reset(): Trial set up with start = (5, 2), destination = (1, 6), deadline = 30
Simulating trial. . .
epsilon = 0.0317; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 7), heading: (0, -1), action: right, reward: 1.82462542203
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 30, 't': 0, 'action': 'right', 'reward': 1.82462542202619, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.82)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 7), heading: (1, 0), action: right, reward: 1.46983986484
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 29, 't': 1, 'action': 'right', 'reward': 1.4698398648360764, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.47)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 7), heading: (1, 0), action: None, reward: 1.23541901097
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 28, 't': 2, 'action': None, 'reward': 1.2354190109680931, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.24)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 7), heading: (1, 0), action: forward, reward: 2.33545018289
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 27, 't': 3, 'action': 'forward', 'reward': 2.3354501828854968, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.34)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 7), heading: (1, 0), action: None, reward: 1.13951427225
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 26, 't': 4, 'action': None, 'reward': 1.1395142722468017, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.14)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 7), heading: (1, 0), action: None, reward: 2.25341925492
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 25, 't': 5, 'action': None, 'reward': 2.2534192549230276, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 2.25)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (7, 7), heading: (1, 0), action: None, reward: 2.42241452863
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 24, 't': 6, 'action': None, 'reward': 2.422414528627903, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.42)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: forward, reward: 1.41052228991
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 23, 't': 7, 'action': 'forward', 'reward': 1.4105222899073073, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.41)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: None, reward: 2.7185729161
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'right'), 'deadline': 22, 't': 8, 'action': None, 'reward': 2.7185729160969445, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 2.72)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: None, reward: 1.01960868695
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 21, 't': 9, 'action': None, 'reward': 1.0196086869495224, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.02)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: forward, reward: 1.63260623374
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 20, 't': 10, 'action': 'forward', 'reward': 1.632606233742428, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.63)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: None, reward: 1.3154853265
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'left'), 'deadline': 19, 't': 11, 'action': None, 'reward': 1.3154853265006905, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 1.32)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: None, reward: 2.80005301623
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'forward'), 'deadline': 18, 't': 12, 'action': None, 'reward': 2.8000530162250667, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 2.80)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (1, 2), heading: (0, 1), action: right, reward: 1.70026769599
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'left'), 'deadline': 17, 't': 13, 'action': 'right', 'reward': 1.700267695985506, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'left')
Agent drove right instead of left. (rewarded 1.70)
53% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: right, reward: 1.04798356535
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 16, 't': 14, 'action': 'right', 'reward': 1.0479835653525182, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.05)
50% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (8, 7), heading: (0, -1), action: right, reward: 1.53624692295
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 15, 't': 15, 'action': 'right', 'reward': 1.536246922950731, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent followed the waypoint right. (rewarded 1.54)
47% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: right, reward: 2.6286970361
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'right', None), 'deadline': 14, 't': 16, 'action': 'right', 'reward': 2.6286970361004944, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', None)
Agent followed the waypoint right. (rewarded 2.63)
43% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 6), heading: (0, -1), action: left, reward: 2.35435893898
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 13, 't': 17, 'action': 'left', 'reward': 2.3543589389760573, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 2.35)
40% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 347
\-------------------------
Environment.reset(): Trial set up with start = (3, 2), destination = (7, 7), deadline = 25
Simulating trial. . .
epsilon = 0.0314; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: right, reward: 2.16343927636
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 2.1634392763558323, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.16)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: None, reward: 2.23661285133
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 24, 't': 1, 'action': None, 'reward': 2.2366128513262717, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.24)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: None, reward: 2.54129930081
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 23, 't': 2, 'action': None, 'reward': 2.5412993008116778, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.54)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (1, 2), heading: (-1, 0), action: forward, reward: 2.34643961814
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': 2.3464396181395193, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.35)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: forward, reward: 2.06414973654
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 2.0641497365435155, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 2.06)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 2), heading: (-1, 0), action: forward, reward: 0.972093259524
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 20, 't': 5, 'action': 'forward', 'reward': 0.9720932595236371, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 0.97)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (7, 7), heading: (0, -1), action: right, reward: 1.63935163625
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 19, 't': 6, 'action': 'right', 'reward': 1.6393516362509577, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 1.64)
72% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 348
\-------------------------
Environment.reset(): Trial set up with start = (1, 2), destination = (6, 3), deadline = 20
Simulating trial. . .
epsilon = 0.0311; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 2), heading: (0, -1), action: None, reward: 1.48494342471
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'left'), 'deadline': 20, 't': 0, 'action': None, 'reward': 1.4849434247145745, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 1.48)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 2), heading: (0, -1), action: None, reward: 2.63118248526
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'right'), 'deadline': 19, 't': 1, 'action': None, 'reward': 2.631182485258892, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'right')
Agent properly idled at a red light. (rewarded 2.63)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 2), heading: (0, -1), action: None, reward: 2.90507174551
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.905071745513811, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.91)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: left, reward: 2.85611547279
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 17, 't': 3, 'action': 'left', 'reward': 2.8561154727871108, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.86)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 2), heading: (-1, 0), action: forward, reward: 2.10823388651
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 2.10823388651319, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.11)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (6, 2), heading: (-1, 0), action: forward, reward: 1.05631570989
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 1.0563157098890827, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.06)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 3), heading: (0, 1), action: left, reward: 2.86310177409
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 14, 't': 6, 'action': 'left', 'reward': 2.8631017740853144, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 2.86)
65% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 349
\-------------------------
Environment.reset(): Trial set up with start = (1, 2), destination = (4, 3), deadline = 20
Simulating trial. . .
epsilon = 0.0308; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 2), heading: (1, 0), action: left, reward: 2.17856004938
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 20, 't': 0, 'action': 'left', 'reward': 2.178560049377509, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.18)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 2), heading: (1, 0), action: None, reward: 2.35869400305
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 19, 't': 1, 'action': None, 'reward': 2.3586940030475576, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.36)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 2), heading: (1, 0), action: None, reward: 2.78733544403
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.787335444027782, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.79)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 2), heading: (1, 0), action: forward, reward: 2.35825691946
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 2.358256919463849, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.36)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 2), heading: (1, 0), action: None, reward: 1.45990193821
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 16, 't': 4, 'action': None, 'reward': 1.4599019382064051, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.46)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (4, 2), heading: (1, 0), action: forward, reward: 2.47610003973
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 2.4761000397289425, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.48)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 3), heading: (0, 1), action: right, reward: 0.966428115185
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 14, 't': 6, 'action': 'right', 'reward': 0.9664281151854215, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent followed the waypoint right. (rewarded 0.97)
65% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 350
\-------------------------
Environment.reset(): Trial set up with start = (3, 2), destination = (6, 6), deadline = 25
Simulating trial. . .
epsilon = 0.0305; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (4, 2), heading: (1, 0), action: right, reward: 1.15469225288
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 1.1546922528829455, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 1.15)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (4, 2), heading: (1, 0), action: None, reward: 2.74914810959
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 24, 't': 1, 'action': None, 'reward': 2.7491481095894947, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.75)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (4, 2), heading: (1, 0), action: None, reward: 2.96656410411
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 23, 't': 2, 'action': None, 'reward': 2.966564104108871, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.97)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 2), heading: (1, 0), action: forward, reward: 2.15876568277
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': 2.1587656827662753, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 2.16)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (6, 2), heading: (1, 0), action: forward, reward: 1.24068105225
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 1.240681052249799, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.24)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (6, 7), heading: (0, -1), action: left, reward: 2.18363984763
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 20, 't': 5, 'action': 'left', 'reward': 2.1836398476277967, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.18)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (6, 7), heading: (0, -1), action: None, reward: 2.43276616444
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 19, 't': 6, 'action': None, 'reward': 2.4327661644411274, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.43)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 6), heading: (0, -1), action: forward, reward: 1.53194431143
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 1.5319443114270588, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.53)
68% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 351
\-------------------------
Environment.reset(): Trial set up with start = (4, 4), destination = (6, 2), deadline = 20
Simulating trial. . .
epsilon = 0.0302; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 4), heading: (1, 0), action: left, reward: 2.20453626904
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 20, 't': 0, 'action': 'left', 'reward': 2.2045362690439827, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 2.20)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 4), heading: (1, 0), action: None, reward: 2.09658284756
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 19, 't': 1, 'action': None, 'reward': 2.096582847562148, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.10)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 4), heading: (1, 0), action: None, reward: 2.86172890093
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.8617289009265106, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 2.86)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 4), heading: (1, 0), action: None, reward: 1.75686141716
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 1.7568614171622958, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.76)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 4), heading: (1, 0), action: None, reward: 2.49685467182
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 16, 't': 4, 'action': None, 'reward': 2.4968546718150018, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 2.50)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 4), heading: (1, 0), action: None, reward: 2.90897295286
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 15, 't': 5, 'action': None, 'reward': 2.9089729528636408, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.91)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (6, 4), heading: (1, 0), action: forward, reward: 1.76556074053
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'left'), 'deadline': 14, 't': 6, 'action': 'forward', 'reward': 1.7655607405293443, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'left')
Agent followed the waypoint forward. (rewarded 1.77)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 5), heading: (0, 1), action: right, reward: 1.57378298672
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'left'), 'deadline': 13, 't': 7, 'action': 'right', 'reward': 1.573782986722209, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'left')
Agent drove right instead of left. (rewarded 1.57)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (5, 5), heading: (-1, 0), action: right, reward: -0.0537524182438
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', 'forward'), 'deadline': 12, 't': 8, 'action': 'right', 'reward': -0.05375241824384569, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', 'forward')
Agent drove right instead of forward. (rewarded -0.05)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (5, 5), heading: (-1, 0), action: None, reward: 1.38892811629
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 11, 't': 9, 'action': None, 'reward': 1.3889281162908866, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.39)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (5, 5), heading: (-1, 0), action: None, reward: 2.42618385863
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', 'forward'), 'deadline': 10, 't': 10, 'action': None, 'reward': 2.426183858626415, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 2.43)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (5, 4), heading: (0, -1), action: right, reward: 0.228347969152
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 9, 't': 11, 'action': 'right', 'reward': 0.22834796915185573, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 0.23)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (4, 4), heading: (-1, 0), action: left, reward: 0.432829145039
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', 'forward'), 'deadline': 8, 't': 12, 'action': 'left', 'reward': 0.432829145039324, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', 'forward')
Agent drove left instead of right. (rewarded 0.43)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (4, 4), heading: (-1, 0), action: left, reward: -39.0980138981
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': 'right'}, 'violation': 4, 'light': 'red', 'state': ('right', 'red', 'right', 'right'), 'deadline': 7, 't': 13, 'action': 'left', 'reward': -39.098013898055406, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', 'right')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -39.10)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (4, 4), heading: (-1, 0), action: None, reward: 0.230027296872
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'forward', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'right', 'right'), 'deadline': 6, 't': 14, 'action': None, 'reward': 0.23002729687182633, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', 'right')
Agent properly idled at a red light. (rewarded 0.23)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (4, 3), heading: (0, -1), action: right, reward: 2.18632067593
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 5, 't': 15, 'action': 'right', 'reward': 2.186320675933737, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 2.19)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (5, 3), heading: (1, 0), action: right, reward: 1.04988058458
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 4, 't': 16, 'action': 'right', 'reward': 1.0498805845803176, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 1.05)
15% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (5, 3), heading: (1, 0), action: None, reward: 1.22098483618
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 3, 't': 17, 'action': None, 'reward': 1.2209848361780158, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.22)
10% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (5, 3), heading: (1, 0), action: None, reward: 1.8711953121
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 2, 't': 18, 'action': None, 'reward': 1.8711953121037157, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 1.87)
5% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (5, 3), heading: (1, 0), action: None, reward: 0.872639854138
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 1, 't': 19, 'action': None, 'reward': 0.8726398541377194, 'waypoint': 'forward'}
Environment.step(): Primary agent ran out of time! Trial aborted.
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 0.87)
0% of time remaining to reach destination.
Trial Aborted!
Agent did not reach the destination.
/-------------------------
| Training trial 352
\-------------------------
Environment.reset(): Trial set up with start = (1, 3), destination = (4, 4), deadline = 20
Simulating trial. . .
epsilon = 0.0299; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: right, reward: 1.81379415228
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.8137941522829801, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.81)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 2.63625124815
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'forward'), 'deadline': 19, 't': 1, 'action': None, 'reward': 2.6362512481460545, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 2.64)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 2.32064750974
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.3206475097353163, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 2.32)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 1.19906436706
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 1.1990643670622882, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.20)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 1.52403832051
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 16, 't': 4, 'action': None, 'reward': 1.5240383205128936, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.52)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 2.03321971801
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 15, 't': 5, 'action': None, 'reward': 2.0332197180141742, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.03)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: forward, reward: 1.51780238827
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 14, 't': 6, 'action': 'forward', 'reward': 1.5178023882686542, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.52)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (4, 3), heading: (1, 0), action: forward, reward: 1.48921733388
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 1.4892173338777182, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'right')
Agent followed the waypoint forward. (rewarded 1.49)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (4, 3), heading: (1, 0), action: forward, reward: -9.02922192621
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 12, 't': 8, 'action': 'forward', 'reward': -9.02922192621492, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent attempted driving forward through a red light. (rewarded -9.03)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 4), heading: (0, 1), action: right, reward: 2.60805135542
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 11, 't': 9, 'action': 'right', 'reward': 2.608051355419299, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent followed the waypoint right. (rewarded 2.61)
50% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 353
\-------------------------
Environment.reset(): Trial set up with start = (1, 4), destination = (3, 6), deadline = 20
Simulating trial. . .
epsilon = 0.0296; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 4), heading: (0, -1), action: forward, reward: -39.1754447382
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'forward', 'left': None}, 'violation': 4, 'light': 'red', 'state': ('right', 'red', 'right', None), 'deadline': 20, 't': 0, 'action': 'forward', 'reward': -39.17544473816673, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', None)
Agent attempted driving forward through a red light with traffic and cause a major accident. (rewarded -39.18)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 4), heading: (1, 0), action: right, reward: 2.23410574364
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 2.2341057436359684, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.23)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: forward, reward: 1.58979956726
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 18, 't': 2, 'action': 'forward', 'reward': 1.5897995672567955, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.59)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 5), heading: (0, 1), action: right, reward: 2.09232177275
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 17, 't': 3, 'action': 'right', 'reward': 2.092321772751627, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 2.09)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 5), heading: (0, 1), action: None, reward: 1.05430259034
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 16, 't': 4, 'action': None, 'reward': 1.0543025903361132, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.05)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (3, 5), heading: (0, 1), action: None, reward: 2.71828521898
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 15, 't': 5, 'action': None, 'reward': 2.718285218978613, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.72)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 5), heading: (-1, 0), action: right, reward: 0.872091790378
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 14, 't': 6, 'action': 'right', 'reward': 0.8720917903777536, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent drove right instead of forward. (rewarded 0.87)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 4), heading: (0, -1), action: right, reward: 0.727556081175
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 13, 't': 7, 'action': 'right', 'reward': 0.7275560811746482, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 0.73)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: right, reward: 1.79095168686
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 12, 't': 8, 'action': 'right', 'reward': 1.7909516868614006, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.79)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (3, 5), heading: (0, 1), action: right, reward: 2.32317398961
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 11, 't': 9, 'action': 'right', 'reward': 2.3231739896087857, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent followed the waypoint right. (rewarded 2.32)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (3, 5), heading: (0, 1), action: None, reward: 1.14204788949
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 10, 't': 10, 'action': None, 'reward': 1.142047889485058, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.14)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (3, 5), heading: (0, 1), action: None, reward: 2.48160919505
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 9, 't': 11, 'action': None, 'reward': 2.4816091950539523, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.48)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (3, 5), heading: (0, 1), action: None, reward: 2.43531566481
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'forward'), 'deadline': 8, 't': 12, 'action': None, 'reward': 2.4353156648113083, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 2.44)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: left, reward: 1.06308842735
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'right'), 'deadline': 7, 't': 13, 'action': 'left', 'reward': 1.0630884273535184, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'right')
Agent drove left instead of forward. (rewarded 1.06)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (4, 6), heading: (0, 1), action: right, reward: 1.09379085795
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 6, 't': 14, 'action': 'right', 'reward': 1.0937908579468145, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent followed the waypoint right. (rewarded 1.09)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 6), heading: (-1, 0), action: right, reward: 2.01988821239
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', 'left'), 'deadline': 5, 't': 15, 'action': 'right', 'reward': 2.019888212389147, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', 'left')
Agent followed the waypoint right. (rewarded 2.02)
20% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 354
\-------------------------
Environment.reset(): Trial set up with start = (1, 4), destination = (6, 5), deadline = 20
Simulating trial. . .
epsilon = 0.0293; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 3), heading: (0, -1), action: forward, reward: 1.65511438761
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', None), 'deadline': 20, 't': 0, 'action': 'forward', 'reward': 1.655114387613305, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', None)
Agent drove forward instead of left. (rewarded 1.66)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 3), heading: (0, -1), action: None, reward: 1.90446889305
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 19, 't': 1, 'action': None, 'reward': 1.9044688930529137, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.90)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 3), heading: (0, -1), action: None, reward: 2.96532590319
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.965325903194441, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.97)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 3), heading: (-1, 0), action: left, reward: 2.58721286325
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'right'), 'deadline': 17, 't': 3, 'action': 'left', 'reward': 2.5872128632545133, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'right')
Agent followed the waypoint left. (rewarded 2.59)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 3), heading: (-1, 0), action: forward, reward: 1.34047705859
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 1.3404770585924488, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.34)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 3), heading: (-1, 0), action: None, reward: 1.03498215032
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 15, 't': 5, 'action': None, 'reward': 1.0349821503211096, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 1.03)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (7, 3), heading: (-1, 0), action: None, reward: 1.98074492114
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 14, 't': 6, 'action': None, 'reward': 1.9807449211396306, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.98)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 3), heading: (-1, 0), action: forward, reward: 2.3615162445
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 2.361516244501706, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.36)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (6, 3), heading: (-1, 0), action: None, reward: 1.56179969367
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 12, 't': 8, 'action': None, 'reward': 1.5617996936717147, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.56)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (6, 2), heading: (0, -1), action: right, reward: 0.00910697349634
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', 'forward'), 'deadline': 11, 't': 9, 'action': 'right', 'reward': 0.009106973496343151, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', 'forward')
Agent drove right instead of left. (rewarded 0.01)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (6, 7), heading: (0, -1), action: forward, reward: 2.24807339423
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 10, 't': 10, 'action': 'forward', 'reward': 2.2480733942316675, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.25)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (6, 7), heading: (0, -1), action: None, reward: 2.41570664561
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 9, 't': 11, 'action': None, 'reward': 2.4157066456072727, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.42)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (6, 7), heading: (0, -1), action: None, reward: 0.854507028714
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 8, 't': 12, 'action': None, 'reward': 0.8545070287143461, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 0.85)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (6, 6), heading: (0, -1), action: forward, reward: 1.90224365428
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 7, 't': 13, 'action': 'forward', 'reward': 1.902243654282366, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.90)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 5), heading: (0, -1), action: forward, reward: 1.48901560482
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 6, 't': 14, 'action': 'forward', 'reward': 1.4890156048209668, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.49)
25% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 355
\-------------------------
Environment.reset(): Trial set up with start = (8, 5), destination = (6, 7), deadline = 20
Simulating trial. . .
epsilon = 0.0290; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 6), heading: (0, 1), action: right, reward: 2.41786887351
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 2.417868873505965, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.42)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 6), heading: (0, 1), action: None, reward: 0.429488842206
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'right', 'right'), 'deadline': 19, 't': 1, 'action': None, 'reward': 0.42948884220606354, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', 'right')
Agent properly idled at a red light. (rewarded 0.43)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 6), heading: (0, 1), action: left, reward: -10.2722205098
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 18, 't': 2, 'action': 'left', 'reward': -10.272220509821492, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -10.27)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 6), heading: (0, 1), action: None, reward: 1.04468925176
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'forward'), 'deadline': 17, 't': 3, 'action': None, 'reward': 1.0446892517634847, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.04)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 6), heading: (-1, 0), action: right, reward: 2.79666077441
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 16, 't': 4, 'action': 'right', 'reward': 2.7966607744146943, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.80)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 6), heading: (-1, 0), action: None, reward: 1.61261434218
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 15, 't': 5, 'action': None, 'reward': 1.6126143421826673, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.61)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (6, 6), heading: (-1, 0), action: forward, reward: 1.19717662572
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 14, 't': 6, 'action': 'forward', 'reward': 1.1971766257208556, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.20)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 6), heading: (-1, 0), action: None, reward: 1.72690730296
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'forward'), 'deadline': 13, 't': 7, 'action': None, 'reward': 1.7269073029635116, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 1.73)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (6, 6), heading: (-1, 0), action: None, reward: 2.36827094151
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'right'), 'deadline': 12, 't': 8, 'action': None, 'reward': 2.3682709415060965, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'right')
Agent properly idled at a red light. (rewarded 2.37)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (6, 6), heading: (-1, 0), action: None, reward: 0.989980331003
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 11, 't': 9, 'action': None, 'reward': 0.9899803310031419, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 0.99)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 7), heading: (0, 1), action: left, reward: 1.26387105516
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 10, 't': 10, 'action': 'left', 'reward': 1.2638710551599674, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.26)
45% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 356
\-------------------------
Environment.reset(): Trial set up with start = (8, 4), destination = (2, 6), deadline = 20
Simulating trial. . .
epsilon = 0.0287; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 4), heading: (0, 1), action: left, reward: -40.886959871
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': 'forward'}, 'violation': 4, 'light': 'red', 'state': ('left', 'red', 'left', 'forward'), 'deadline': 20, 't': 0, 'action': 'left', 'reward': -40.886959870998645, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'forward')
Agent attempted driving left through a red light with traffic and cause a major accident. (rewarded -40.89)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 4), heading: (0, 1), action: None, reward: 2.37346131931
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'right'), 'deadline': 19, 't': 1, 'action': None, 'reward': 2.373461319310233, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'right')
Agent properly idled at a red light. (rewarded 2.37)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 4), heading: (0, 1), action: None, reward: 2.5468097234
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'right'), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.5468097234022595, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'right')
Agent properly idled at a red light. (rewarded 2.55)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 4), heading: (0, 1), action: None, reward: 2.28585430749
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 2.2858543074878948, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.29)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 4), heading: (1, 0), action: left, reward: 2.91464681096
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'right'), 'deadline': 16, 't': 4, 'action': 'left', 'reward': 2.914646810962571, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'right')
Agent followed the waypoint left. (rewarded 2.91)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 4), heading: (1, 0), action: None, reward: 1.58600775112
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 15, 't': 5, 'action': None, 'reward': 1.5860077511221153, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.59)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 4), heading: (1, 0), action: None, reward: 1.14649238926
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 14, 't': 6, 'action': None, 'reward': 1.146492389264045, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.15)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 4), heading: (1, 0), action: forward, reward: 1.93994294514
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 1.9399429451442471, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.94)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 5), heading: (0, 1), action: right, reward: 1.76409685198
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 12, 't': 8, 'action': 'right', 'reward': 1.764096851978228, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent followed the waypoint right. (rewarded 1.76)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 5), heading: (0, 1), action: None, reward: 1.29881679159
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 11, 't': 9, 'action': None, 'reward': 1.2988167915939495, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.30)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: forward, reward: 1.0650163095
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 10, 't': 10, 'action': 'forward', 'reward': 1.065016309499126, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent followed the waypoint forward. (rewarded 1.07)
45% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 357
\-------------------------
Environment.reset(): Trial set up with start = (6, 7), destination = (2, 6), deadline = 25
Simulating trial. . .
epsilon = 0.0284; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 7), heading: (0, 1), action: None, reward: 2.66530544851
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 25, 't': 0, 'action': None, 'reward': 2.6653054485117598, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.67)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 7), heading: (-1, 0), action: right, reward: 0.918207428437
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 24, 't': 1, 'action': 'right', 'reward': 0.9182074284369965, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent drove right instead of left. (rewarded 0.92)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (4, 7), heading: (-1, 0), action: forward, reward: 1.58641105095
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 23, 't': 2, 'action': 'forward', 'reward': 1.5864110509488092, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.59)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 7), heading: (-1, 0), action: forward, reward: 2.19557783187
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': 2.1955778318655614, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 2.20)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 7), heading: (-1, 0), action: forward, reward: 2.54097583229
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 2.540975832293028, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent followed the waypoint forward. (rewarded 2.54)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (2, 6), heading: (0, -1), action: right, reward: 2.66966962852
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 20, 't': 5, 'action': 'right', 'reward': 2.669669628521148, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.67)
76% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 358
\-------------------------
Environment.reset(): Trial set up with start = (2, 5), destination = (7, 2), deadline = 30
Simulating trial. . .
epsilon = 0.0282; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 6), heading: (0, 1), action: right, reward: 1.90711229308
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'right', 'forward'), 'deadline': 30, 't': 0, 'action': 'right', 'reward': 1.9071122930810696, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'right', 'forward')
Agent followed the waypoint right. (rewarded 1.91)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: right, reward: 1.99765029968
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 29, 't': 1, 'action': 'right', 'reward': 1.9976502996784158, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent followed the waypoint right. (rewarded 2.00)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 6), heading: (-1, 0), action: forward, reward: 1.35421795198
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 28, 't': 2, 'action': 'forward', 'reward': 1.354217951978155, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.35)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 6), heading: (-1, 0), action: None, reward: 1.87729274379
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 27, 't': 3, 'action': None, 'reward': 1.8772927437850693, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 1.88)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 6), heading: (-1, 0), action: forward, reward: 1.96026115718
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 26, 't': 4, 'action': 'forward', 'reward': 1.9602611571794346, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.96)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 7), heading: (0, 1), action: left, reward: 1.60831914817
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 25, 't': 5, 'action': 'left', 'reward': 1.6083191481749342, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 1.61)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (7, 2), heading: (0, 1), action: forward, reward: 1.12460068595
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 24, 't': 6, 'action': 'forward', 'reward': 1.1246006859537208, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.12)
77% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 359
\-------------------------
Environment.reset(): Trial set up with start = (6, 4), destination = (5, 7), deadline = 20
Simulating trial. . .
epsilon = 0.0279; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 4), heading: (-1, 0), action: None, reward: 1.19073173062
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 20, 't': 0, 'action': None, 'reward': 1.1907317306210783, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.19)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 4), heading: (-1, 0), action: None, reward: 2.94738587962
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 19, 't': 1, 'action': None, 'reward': 2.947385879623994, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.95)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 4), heading: (-1, 0), action: None, reward: 1.80329203929
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.8032920392891403, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.80)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 4), heading: (-1, 0), action: forward, reward: 1.61428478183
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'left'), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 1.6142847818256116, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'left')
Agent followed the waypoint forward. (rewarded 1.61)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 3), heading: (0, -1), action: right, reward: 2.83752691693
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 16, 't': 4, 'action': 'right', 'reward': 2.837526916928044, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 2.84)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 2), heading: (0, -1), action: forward, reward: 1.69796919204
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 1.6979691920406992, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent followed the waypoint forward. (rewarded 1.70)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 7), heading: (0, -1), action: forward, reward: 2.82723415821
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'forward'), 'deadline': 14, 't': 6, 'action': 'forward', 'reward': 2.8272341582072986, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'forward')
Agent followed the waypoint forward. (rewarded 2.83)
65% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 360
\-------------------------
Environment.reset(): Trial set up with start = (5, 6), destination = (1, 5), deadline = 25
Simulating trial. . .
epsilon = 0.0276; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 7), heading: (0, 1), action: left, reward: 0.827003540303
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', 'forward'), 'deadline': 25, 't': 0, 'action': 'left', 'reward': 0.8270035403031164, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', 'forward')
Agent drove left instead of right. (rewarded 0.83)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 7), heading: (1, 0), action: left, reward: 1.70694343232
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 24, 't': 1, 'action': 'left', 'reward': 1.706943432316282, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.71)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 7), heading: (1, 0), action: forward, reward: 2.10987408451
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 23, 't': 2, 'action': 'forward', 'reward': 2.109874084508691, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.11)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 6), heading: (0, -1), action: left, reward: 1.58911391177
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 22, 't': 3, 'action': 'left', 'reward': 1.589113911769796, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove left instead of forward. (rewarded 1.59)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: right, reward: 2.21879122636
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 21, 't': 4, 'action': 'right', 'reward': 2.218791226358144, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 2.22)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: forward, reward: 2.80896042243
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 20, 't': 5, 'action': 'forward', 'reward': 2.8089604224290294, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.81)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: None, reward: 1.48010625662
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 19, 't': 6, 'action': None, 'reward': 1.4801062566214993, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.48)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: right, reward: 0.3821752386
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', 'right'), 'deadline': 18, 't': 7, 'action': 'right', 'reward': 0.382175238600466, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', 'right')
Agent drove right instead of left. (rewarded 0.38)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: None, reward: 1.13924336627
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'forward'), 'deadline': 17, 't': 8, 'action': None, 'reward': 1.1392433662746022, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.14)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: right, reward: 2.28131448661
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 16, 't': 9, 'action': 'right', 'reward': 2.281314486607156, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.28)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (8, 6), heading: (0, -1), action: right, reward: 1.3084094854
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 15, 't': 10, 'action': 'right', 'reward': 1.3084094853988184, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.31)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: right, reward: 2.51099385082
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 14, 't': 11, 'action': 'right', 'reward': 2.510993850818037, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.51)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: None, reward: 0.978838878464
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 13, 't': 12, 'action': None, 'reward': 0.9788388784643216, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 0.98)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 5), heading: (0, -1), action: left, reward: 1.2550483738
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 12, 't': 13, 'action': 'left', 'reward': 1.2550483738037628, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.26)
44% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 361
\-------------------------
Environment.reset(): Trial set up with start = (4, 2), destination = (6, 4), deadline = 20
Simulating trial. . .
epsilon = 0.0273; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 2), heading: (-1, 0), action: right, reward: 1.68636249921
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', 'forward'), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.6863624992054123, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', 'forward')
Agent drove right instead of left. (rewarded 1.69)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 2), heading: (-1, 0), action: None, reward: 1.26068455569
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 19, 't': 1, 'action': None, 'reward': 1.2606845556914321, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.26)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 2), heading: (-1, 0), action: None, reward: 2.51991486105
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.51991486104866, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.52)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 2), heading: (-1, 0), action: None, reward: 2.25683796846
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 17, 't': 3, 'action': None, 'reward': 2.2568379684586346, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.26)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 2), heading: (-1, 0), action: None, reward: 2.50157074466
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 16, 't': 4, 'action': None, 'reward': 2.501570744658001, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.50)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (3, 3), heading: (0, 1), action: left, reward: 2.75380604362
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 15, 't': 5, 'action': 'left', 'reward': 2.7538060436210774, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.75)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 3), heading: (1, 0), action: left, reward: 2.07910117824
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 14, 't': 6, 'action': 'left', 'reward': 2.0791011782449904, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 2.08)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (5, 3), heading: (1, 0), action: forward, reward: 1.28119038051
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 1.281190380505935, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.28)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (5, 3), heading: (1, 0), action: None, reward: 1.46894176396
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 12, 't': 8, 'action': None, 'reward': 1.4689417639615436, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.47)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (5, 3), heading: (1, 0), action: None, reward: 1.81924522734
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 11, 't': 9, 'action': None, 'reward': 1.8192452273381854, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.82)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (5, 3), heading: (1, 0), action: None, reward: 2.49925134682
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 10, 't': 10, 'action': None, 'reward': 2.4992513468207744, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.50)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (6, 3), heading: (1, 0), action: forward, reward: 2.06651526975
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'left'), 'deadline': 9, 't': 11, 'action': 'forward', 'reward': 2.0665152697478186, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'left')
Agent followed the waypoint forward. (rewarded 2.07)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 4), heading: (0, 1), action: right, reward: 2.12212009074
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'right', None), 'deadline': 8, 't': 12, 'action': 'right', 'reward': 2.122120090735522, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'right', None)
Agent followed the waypoint right. (rewarded 2.12)
35% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 362
\-------------------------
Environment.reset(): Trial set up with start = (4, 5), destination = (6, 3), deadline = 20
Simulating trial. . .
epsilon = 0.0271; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 5), heading: (1, 0), action: left, reward: 1.16433435263
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 20, 't': 0, 'action': 'left', 'reward': 1.1643343526283376, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.16)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 5), heading: (1, 0), action: forward, reward: 2.1030608593
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 19, 't': 1, 'action': 'forward', 'reward': 2.1030608592999065, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.10)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 5), heading: (1, 0), action: None, reward: 1.16949413937
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', 'forward'), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.1694941393744378, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 1.17)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (6, 6), heading: (0, 1), action: right, reward: 0.285147758775
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 17, 't': 3, 'action': 'right', 'reward': 0.2851477587746075, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 0.29)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (6, 7), heading: (0, 1), action: forward, reward: 2.15176189313
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 2.1517618931308786, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.15)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (6, 7), heading: (0, 1), action: None, reward: 1.02634393389
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'left'), 'deadline': 15, 't': 5, 'action': None, 'reward': 1.0263439338889215, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'left')
Agent properly idled at a red light. (rewarded 1.03)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (6, 7), heading: (0, 1), action: None, reward: 2.0727590406
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 14, 't': 6, 'action': None, 'reward': 2.072759040598445, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.07)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 2), heading: (0, 1), action: forward, reward: 1.7247308047
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 1.7247308046980154, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.72)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 3), heading: (0, 1), action: forward, reward: 1.25811832982
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 12, 't': 8, 'action': 'forward', 'reward': 1.258118329822763, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent followed the waypoint forward. (rewarded 1.26)
55% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 363
\-------------------------
Environment.reset(): Trial set up with start = (3, 7), destination = (6, 6), deadline = 20
Simulating trial. . .
epsilon = 0.0268; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 7), heading: (1, 0), action: None, reward: 2.93318770189
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 20, 't': 0, 'action': None, 'reward': 2.93318770189268, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.93)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 7), heading: (1, 0), action: None, reward: 1.1740745672
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 19, 't': 1, 'action': None, 'reward': 1.1740745672018629, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 1.17)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 7), heading: (1, 0), action: None, reward: 2.4627936427
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.4627936426992547, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.46)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (4, 7), heading: (1, 0), action: forward, reward: 1.58714199616
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 1.5871419961587783, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.59)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 7), heading: (1, 0), action: forward, reward: 1.78094666372
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 1.7809466637223255, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.78)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (6, 7), heading: (1, 0), action: forward, reward: 1.55354994231
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 1.5535499423096153, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.55)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 6), heading: (0, -1), action: left, reward: 1.18759172538
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 14, 't': 6, 'action': 'left', 'reward': 1.1875917253765773, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 1.19)
65% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 364
\-------------------------
Environment.reset(): Trial set up with start = (8, 4), destination = (5, 2), deadline = 25
Simulating trial. . .
epsilon = 0.0265; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (7, 4), heading: (-1, 0), action: forward, reward: 1.69060887416
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 25, 't': 0, 'action': 'forward', 'reward': 1.6906088741576952, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.69)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (7, 4), heading: (-1, 0), action: None, reward: 1.7784057159
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 24, 't': 1, 'action': None, 'reward': 1.778405715896739, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.78)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 4), heading: (-1, 0), action: None, reward: 2.46201451893
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 23, 't': 2, 'action': None, 'reward': 2.4620145189337395, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.46)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 4), heading: (-1, 0), action: None, reward: 1.76243125103
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 22, 't': 3, 'action': None, 'reward': 1.7624312510269233, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.76)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (6, 4), heading: (-1, 0), action: forward, reward: 1.10335698926
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 1.1033569892555475, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.10)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (6, 4), heading: (-1, 0), action: None, reward: 2.46045507997
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 20, 't': 5, 'action': None, 'reward': 2.4604550799705844, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.46)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (6, 4), heading: (-1, 0), action: None, reward: 1.35546559121
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 19, 't': 6, 'action': None, 'reward': 1.3554655912052143, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.36)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 4), heading: (-1, 0), action: None, reward: -4.15842677349
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 18, 't': 7, 'action': None, 'reward': -4.158426773489037, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.16)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (5, 4), heading: (-1, 0), action: forward, reward: 1.75154477899
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 17, 't': 8, 'action': 'forward', 'reward': 1.7515447789864533, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.75)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (5, 3), heading: (0, -1), action: right, reward: 2.8057010934
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 16, 't': 9, 'action': 'right', 'reward': 2.805701093402037, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent followed the waypoint right. (rewarded 2.81)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 2), heading: (0, -1), action: forward, reward: 2.49707159485
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 15, 't': 10, 'action': 'forward', 'reward': 2.497071594849637, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.50)
56% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 365
\-------------------------
Environment.reset(): Trial set up with start = (1, 6), destination = (7, 3), deadline = 25
Simulating trial. . .
epsilon = 0.0263; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 6), heading: (-1, 0), action: left, reward: 1.32848828213
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 25, 't': 0, 'action': 'left', 'reward': 1.3284882821266175, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 1.33)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (7, 6), heading: (-1, 0), action: forward, reward: 2.96008680387
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 24, 't': 1, 'action': 'forward', 'reward': 2.960086803868819, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.96)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 7), heading: (0, 1), action: left, reward: 1.65255818946
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 23, 't': 2, 'action': 'left', 'reward': 1.652558189463625, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 1.65)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 2), heading: (0, 1), action: forward, reward: 2.19469117863
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': 2.1946911786274383, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.19)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (7, 3), heading: (0, 1), action: forward, reward: 1.74134500431
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 1.741345004307004, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.74)
80% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 366
\-------------------------
Environment.reset(): Trial set up with start = (5, 4), destination = (7, 6), deadline = 20
Simulating trial. . .
epsilon = 0.0260; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 4), heading: (1, 0), action: right, reward: 1.00287947954
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.0028794795383318, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.00)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (7, 4), heading: (1, 0), action: forward, reward: 1.7391218687
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'left'), 'deadline': 19, 't': 1, 'action': 'forward', 'reward': 1.739121868697509, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'left')
Agent followed the waypoint forward. (rewarded 1.74)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 5), heading: (0, 1), action: right, reward: 1.26116261317
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 18, 't': 2, 'action': 'right', 'reward': 1.2611626131731815, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent followed the waypoint right. (rewarded 1.26)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 5), heading: (0, 1), action: None, reward: 2.93975139726
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 2.9397513972603484, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.94)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 5), heading: (0, 1), action: None, reward: 1.61250363483
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 16, 't': 4, 'action': None, 'reward': 1.6125036348252166, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.61)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 5), heading: (0, 1), action: None, reward: 2.68745281762
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 15, 't': 5, 'action': None, 'reward': 2.687452817623181, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.69)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (7, 6), heading: (0, 1), action: forward, reward: 1.51877272191
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 14, 't': 6, 'action': 'forward', 'reward': 1.518772721905203, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.52)
65% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 367
\-------------------------
Environment.reset(): Trial set up with start = (2, 6), destination = (5, 7), deadline = 20
Simulating trial. . .
epsilon = 0.0257; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 7), heading: (0, 1), action: left, reward: 1.11637364191
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'right'), 'deadline': 20, 't': 0, 'action': 'left', 'reward': 1.1163736419085617, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'right')
Agent followed the waypoint left. (rewarded 1.12)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 7), heading: (0, 1), action: None, reward: 2.74755808069
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'left'), 'deadline': 19, 't': 1, 'action': None, 'reward': 2.7475580806860904, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 2.75)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 7), heading: (0, 1), action: None, reward: 2.76323523313
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'left'), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.763235233126875, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 2.76)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 7), heading: (0, 1), action: None, reward: 1.06954033809
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 1.069540338091357, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.07)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 7), heading: (1, 0), action: left, reward: 2.79790152959
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 16, 't': 4, 'action': 'left', 'reward': 2.797901529588424, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.80)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (4, 7), heading: (1, 0), action: forward, reward: 0.950886881004
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 0.9508868810043205, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 0.95)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 7), heading: (1, 0), action: forward, reward: 2.18093237261
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 14, 't': 6, 'action': 'forward', 'reward': 2.180932372611693, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.18)
65% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 368
\-------------------------
Environment.reset(): Trial set up with start = (8, 3), destination = (4, 6), deadline = 35
Simulating trial. . .
epsilon = 0.0255; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 2), heading: (0, -1), action: right, reward: 2.52902259057
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 35, 't': 0, 'action': 'right', 'reward': 2.5290225905743506, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 2.53)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: right, reward: 2.72968132713
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'right', None), 'deadline': 34, 't': 1, 'action': 'right', 'reward': 2.7296813271273086, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', None)
Agent followed the waypoint right. (rewarded 2.73)
94% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 2), heading: (1, 0), action: forward, reward: 1.5445413038
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 33, 't': 2, 'action': 'forward', 'reward': 1.5445413038009106, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.54)
91% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 2), heading: (1, 0), action: None, reward: 2.03464502155
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 32, 't': 3, 'action': None, 'reward': 2.034645021553512, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.03)
89% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 2), heading: (1, 0), action: None, reward: 1.36925085529
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 31, 't': 4, 'action': None, 'reward': 1.3692508552878853, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.37)
86% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (3, 2), heading: (1, 0), action: forward, reward: 1.12305104186
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 30, 't': 5, 'action': 'forward', 'reward': 1.1230510418570783, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.12)
83% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 2), heading: (1, 0), action: None, reward: 1.20123973567
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 29, 't': 6, 'action': None, 'reward': 1.201239735670052, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.20)
80% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 2), heading: (1, 0), action: None, reward: 1.88363364721
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 28, 't': 7, 'action': None, 'reward': 1.8836336472099289, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.88)
77% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 2), heading: (1, 0), action: None, reward: 2.16990546237
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'right'), 'deadline': 27, 't': 8, 'action': None, 'reward': 2.1699054623705827, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'right')
Agent properly idled at a red light. (rewarded 2.17)
74% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (4, 2), heading: (1, 0), action: forward, reward: 1.92302791372
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 26, 't': 9, 'action': 'forward', 'reward': 1.9230279137205188, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.92)
71% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (4, 2), heading: (1, 0), action: None, reward: 2.79260729904
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 25, 't': 10, 'action': None, 'reward': 2.7926072990379915, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.79)
69% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (4, 3), heading: (0, 1), action: right, reward: 0.128906933403
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', 'left'), 'deadline': 24, 't': 11, 'action': 'right', 'reward': 0.12890693340300252, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', 'left')
Agent drove right instead of left. (rewarded 0.13)
66% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (3, 3), heading: (-1, 0), action: right, reward: 1.51452798512
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 23, 't': 12, 'action': 'right', 'reward': 1.5145279851191915, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.51)
63% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (3, 2), heading: (0, -1), action: right, reward: 2.79899705589
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 22, 't': 13, 'action': 'right', 'reward': 2.798997055886784, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.80)
60% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (4, 2), heading: (1, 0), action: right, reward: 1.20873758507
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'right'), 'deadline': 21, 't': 14, 'action': 'right', 'reward': 1.2087375850699835, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'right')
Agent followed the waypoint right. (rewarded 1.21)
57% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (4, 3), heading: (0, 1), action: right, reward: 1.03764743456
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 20, 't': 15, 'action': 'right', 'reward': 1.0376474345619497, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 1.04)
54% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (3, 3), heading: (-1, 0), action: right, reward: 2.2817801528
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 19, 't': 16, 'action': 'right', 'reward': 2.2817801528017636, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.28)
51% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (3, 2), heading: (0, -1), action: right, reward: 2.09361648136
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 18, 't': 17, 'action': 'right', 'reward': 2.0936164813611233, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 2.09)
49% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (4, 2), heading: (1, 0), action: right, reward: 1.57318431779
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 17, 't': 18, 'action': 'right', 'reward': 1.5731843177925247, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 1.57)
46% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (4, 7), heading: (0, -1), action: left, reward: 1.20814428033
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 16, 't': 19, 'action': 'left', 'reward': 1.2081442803275602, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.21)
43% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 6), heading: (0, -1), action: forward, reward: 1.36704297824
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 15, 't': 20, 'action': 'forward', 'reward': 1.3670429782404305, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.37)
40% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 369
\-------------------------
Environment.reset(): Trial set up with start = (5, 5), destination = (1, 2), deadline = 35
Simulating trial. . .
epsilon = 0.0252; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 6), heading: (0, 1), action: left, reward: 1.21961576061
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 35, 't': 0, 'action': 'left', 'reward': 1.2196157606116171, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.22)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 6), heading: (0, 1), action: None, reward: 2.00547363275
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 34, 't': 1, 'action': None, 'reward': 2.005473632747118, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 2.01)
94% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 6), heading: (0, 1), action: None, reward: 1.09676880896
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 33, 't': 2, 'action': None, 'reward': 1.0967688089587002, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.10)
91% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 6), heading: (0, 1), action: None, reward: 2.48480503407
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 32, 't': 3, 'action': None, 'reward': 2.4848050340740526, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.48)
89% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 6), heading: (0, 1), action: None, reward: 2.32716805176
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 31, 't': 4, 'action': None, 'reward': 2.3271680517616042, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.33)
86% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 6), heading: (0, 1), action: None, reward: 2.59881540683
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 30, 't': 5, 'action': None, 'reward': 2.598815406828436, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.60)
83% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 6), heading: (-1, 0), action: right, reward: 1.71795417577
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 29, 't': 6, 'action': 'right', 'reward': 1.7179541757741206, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 1.72)
80% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 6), heading: (-1, 0), action: forward, reward: 1.42453633173
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 28, 't': 7, 'action': 'forward', 'reward': 1.4245363317250463, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.42)
77% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: forward, reward: 2.88670151531
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 27, 't': 8, 'action': 'forward', 'reward': 2.886701515312657, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.89)
74% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: forward, reward: 2.53546949393
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 26, 't': 9, 'action': 'forward', 'reward': 2.5354694939319695, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.54)
71% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: None, reward: 2.33893479282
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 25, 't': 10, 'action': None, 'reward': 2.338934792823254, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.34)
69% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: None, reward: 1.78240786479
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 24, 't': 11, 'action': None, 'reward': 1.7824078647880643, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.78)
66% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: None, reward: 1.72739290698
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 23, 't': 12, 'action': None, 'reward': 1.7273929069778258, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.73)
63% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: left, reward: 2.49232576289
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 22, 't': 13, 'action': 'left', 'reward': 2.4923257628894695, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 2.49)
60% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: None, reward: -4.11746645457
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 21, 't': 14, 'action': None, 'reward': -4.11746645457279, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent idled at a green light with no oncoming traffic. (rewarded -4.12)
57% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: forward, reward: -10.4939228896
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 20, 't': 15, 'action': 'forward', 'reward': -10.493922889637087, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -10.49)
54% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: None, reward: 0.965959854239
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 19, 't': 16, 'action': None, 'reward': 0.9659598542390209, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 0.97)
51% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 2), heading: (0, 1), action: forward, reward: 2.62674466017
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 18, 't': 17, 'action': 'forward', 'reward': 2.626744660169889, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.63)
49% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 370
\-------------------------
Environment.reset(): Trial set up with start = (2, 6), destination = (7, 4), deadline = 25
Simulating trial. . .
epsilon = 0.0250; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 7), heading: (0, 1), action: left, reward: 0.941267984689
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 25, 't': 0, 'action': 'left', 'reward': 0.9412679846891001, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove left instead of forward. (rewarded 0.94)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 2), heading: (0, 1), action: forward, reward: 0.347882089155
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', 'forward'), 'deadline': 24, 't': 1, 'action': 'forward', 'reward': 0.3478820891551597, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', 'forward')
Agent drove forward instead of right. (rewarded 0.35)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 2), heading: (-1, 0), action: right, reward: 1.23024585059
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 23, 't': 2, 'action': 'right', 'reward': 1.2302458505891904, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.23)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (1, 2), heading: (-1, 0), action: None, reward: 1.97099506081
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 22, 't': 3, 'action': None, 'reward': 1.9709950608107576, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.97)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 2), heading: (-1, 0), action: None, reward: 2.27173034335
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 21, 't': 4, 'action': None, 'reward': 2.2717303433518676, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.27)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: forward, reward: 2.66264258715
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 20, 't': 5, 'action': 'forward', 'reward': 2.662642587148338, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.66)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (7, 2), heading: (-1, 0), action: forward, reward: 2.21133827602
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 19, 't': 6, 'action': 'forward', 'reward': 2.2113382760206592, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.21)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (7, 2), heading: (-1, 0), action: None, reward: 1.19589386789
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 18, 't': 7, 'action': None, 'reward': 1.19589386788938, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.20)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (7, 2), heading: (-1, 0), action: None, reward: 2.44509883309
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 17, 't': 8, 'action': None, 'reward': 2.445098833091774, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.45)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (7, 3), heading: (0, 1), action: left, reward: 2.2825123843
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 16, 't': 9, 'action': 'left', 'reward': 2.282512384300807, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.28)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (7, 3), heading: (0, 1), action: None, reward: 1.85234371906
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'right'), 'deadline': 15, 't': 10, 'action': None, 'reward': 1.85234371905765, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'right')
Agent properly idled at a red light. (rewarded 1.85)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (7, 4), heading: (0, 1), action: forward, reward: 2.18310268159
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 14, 't': 11, 'action': 'forward', 'reward': 2.183102681589502, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.18)
52% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 371
\-------------------------
Environment.reset(): Trial set up with start = (8, 2), destination = (6, 6), deadline = 20
Simulating trial. . .
epsilon = 0.0247; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 2), heading: (0, -1), action: None, reward: 1.47491224978
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', 'left'), 'deadline': 20, 't': 0, 'action': None, 'reward': 1.4749122497816032, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', 'left')
Agent properly idled at a red light. (rewarded 1.47)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 2), heading: (0, -1), action: None, reward: 2.44465600055
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 19, 't': 1, 'action': None, 'reward': 2.4446560005525733, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.44)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 2), heading: (0, -1), action: None, reward: 1.60584079424
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.6058407942406754, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.61)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 2), heading: (0, -1), action: None, reward: 2.72108896896
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 17, 't': 3, 'action': None, 'reward': 2.7210889689592257, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.72)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 7), heading: (0, -1), action: forward, reward: 0.887708605166
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 0.8877086051657882, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', None)
Agent drove forward instead of left. (rewarded 0.89)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 7), heading: (0, -1), action: None, reward: 1.58608334217
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 15, 't': 5, 'action': None, 'reward': 1.5860833421725753, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.59)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (8, 7), heading: (0, -1), action: None, reward: 2.75109205296
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 14, 't': 6, 'action': None, 'reward': 2.751092052962166, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.75)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: left, reward: 1.48174270347
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 13, 't': 7, 'action': 'left', 'reward': 1.4817427034706334, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 1.48)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: forward, reward: 0.983617202352
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'right'), 'deadline': 12, 't': 8, 'action': 'forward', 'reward': 0.9836172023521916, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'right')
Agent followed the waypoint forward. (rewarded 0.98)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: None, reward: 0.252094276933
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', 'forward'), 'deadline': 11, 't': 9, 'action': None, 'reward': 0.25209427693317776, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 0.25)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 6), heading: (0, -1), action: right, reward: 1.47645313722
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 10, 't': 10, 'action': 'right', 'reward': 1.4764531372233485, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 1.48)
45% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 372
\-------------------------
Environment.reset(): Trial set up with start = (8, 4), destination = (1, 7), deadline = 20
Simulating trial. . .
epsilon = 0.0245; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 4), heading: (1, 0), action: right, reward: 1.73548055443
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.7354805544285534, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent followed the waypoint right. (rewarded 1.74)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 5), heading: (0, 1), action: right, reward: 0.818346373986
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'left'), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 0.8183463739856326, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'left')
Agent drove right instead of left. (rewarded 0.82)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 5), heading: (0, 1), action: forward, reward: -9.80671124549
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 18, 't': 2, 'action': 'forward', 'reward': -9.806711245492648, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -9.81)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (1, 6), heading: (0, 1), action: forward, reward: 1.80538899918
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 1.8053889991798098, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.81)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 6), heading: (0, 1), action: left, reward: -9.04450301045
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 16, 't': 4, 'action': 'left', 'reward': -9.04450301044745, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent attempted driving left through a red light. (rewarded -9.04)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 6), heading: (0, 1), action: None, reward: 2.38735848642
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'forward'), 'deadline': 15, 't': 5, 'action': None, 'reward': 2.3873584864157094, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 2.39)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 6), heading: (0, 1), action: None, reward: 1.16198379813
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'right'), 'deadline': 14, 't': 6, 'action': None, 'reward': 1.1619837981331291, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'right')
Agent properly idled at a red light. (rewarded 1.16)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: forward, reward: 2.19556261065
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'right'), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 2.1955626106451804, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'right')
Agent followed the waypoint forward. (rewarded 2.20)
60% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 373
\-------------------------
Environment.reset(): Trial set up with start = (6, 5), destination = (3, 2), deadline = 30
Simulating trial. . .
epsilon = 0.0242; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 5), heading: (0, -1), action: None, reward: 2.74112971755
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'forward'), 'deadline': 30, 't': 0, 'action': None, 'reward': 2.741129717548848, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 2.74)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 5), heading: (0, -1), action: None, reward: 2.95196769716
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 29, 't': 1, 'action': None, 'reward': 2.95196769716054, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.95)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 5), heading: (0, -1), action: None, reward: 2.5343317578
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 28, 't': 2, 'action': None, 'reward': 2.53433175779574, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.53)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 5), heading: (1, 0), action: right, reward: 0.761592500344
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'left'), 'deadline': 27, 't': 3, 'action': 'right', 'reward': 0.7615925003438705, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'left')
Agent drove right instead of left. (rewarded 0.76)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 5), heading: (1, 0), action: forward, reward: 1.17395555561
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 26, 't': 4, 'action': 'forward', 'reward': 1.173955555609015, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.17)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 5), heading: (1, 0), action: forward, reward: 1.99496474386
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 25, 't': 5, 'action': 'forward', 'reward': 1.9949647438630354, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.99)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 5), heading: (1, 0), action: None, reward: 2.24564408068
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 24, 't': 6, 'action': None, 'reward': 2.245644080680671, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.25)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: forward, reward: 2.75115799523
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 23, 't': 7, 'action': 'forward', 'reward': 2.751157995227862, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 2.75)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: None, reward: 1.07122685857
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 22, 't': 8, 'action': None, 'reward': 1.071226858570649, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.07)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: None, reward: 2.4023841284
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 21, 't': 9, 'action': None, 'reward': 2.402384128398691, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.40)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: forward, reward: 0.882188075683
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 20, 't': 10, 'action': 'forward', 'reward': 0.8821880756827574, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 0.88)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (3, 6), heading: (0, 1), action: right, reward: 1.75322415691
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'right'), 'deadline': 19, 't': 11, 'action': 'right', 'reward': 1.753224156914272, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'right')
Agent followed the waypoint right. (rewarded 1.75)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (3, 7), heading: (0, 1), action: forward, reward: 1.37319537604
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 18, 't': 12, 'action': 'forward', 'reward': 1.3731953760418216, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.37)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (3, 7), heading: (0, 1), action: None, reward: 1.94317694416
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 17, 't': 13, 'action': None, 'reward': 1.9431769441605709, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 1.94)
53% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (3, 7), heading: (0, 1), action: None, reward: 2.12937910314
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 16, 't': 14, 'action': None, 'reward': 2.129379103140729, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.13)
50% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 2), heading: (0, 1), action: forward, reward: 1.62393336265
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 15, 't': 15, 'action': 'forward', 'reward': 1.6239333626463959, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.62)
47% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 374
\-------------------------
Environment.reset(): Trial set up with start = (5, 6), destination = (7, 2), deadline = 20
Simulating trial. . .
epsilon = 0.0240; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 6), heading: (1, 0), action: right, reward: 2.32112608363
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'right', 'left'), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 2.32112608363137, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'right', 'left')
Agent followed the waypoint right. (rewarded 2.32)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 5), heading: (0, -1), action: left, reward: 1.25884315782
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 19, 't': 1, 'action': 'left', 'reward': 1.2588431578185602, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove left instead of forward. (rewarded 1.26)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 4), heading: (0, -1), action: forward, reward: 1.50313734273
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', 'forward'), 'deadline': 18, 't': 2, 'action': 'forward', 'reward': 1.5031373427291057, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', 'forward')
Agent drove forward instead of right. (rewarded 1.50)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 4), heading: (1, 0), action: right, reward: 1.64416267592
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 17, 't': 3, 'action': 'right', 'reward': 1.6441626759222707, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent followed the waypoint right. (rewarded 1.64)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 5), heading: (0, 1), action: right, reward: 0.955395236433
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', 'forward'), 'deadline': 16, 't': 4, 'action': 'right', 'reward': 0.9553952364334223, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', 'forward')
Agent drove right instead of left. (rewarded 0.96)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 6), heading: (0, 1), action: forward, reward: 1.0229310241
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 1.022931024098049, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.02)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (7, 6), heading: (0, 1), action: None, reward: 2.30849126442
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 14, 't': 6, 'action': None, 'reward': 2.3084912644186057, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 2.31)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (7, 6), heading: (0, 1), action: None, reward: 2.19814387824
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 13, 't': 7, 'action': None, 'reward': 2.1981438782389384, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.20)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (7, 6), heading: (0, 1), action: None, reward: 1.85595893462
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 12, 't': 8, 'action': None, 'reward': 1.8559589346224818, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.86)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (7, 6), heading: (0, 1), action: None, reward: 2.19128335409
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 11, 't': 9, 'action': None, 'reward': 2.1912833540903516, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.19)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (7, 6), heading: (0, 1), action: None, reward: 1.46017324589
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 10, 't': 10, 'action': None, 'reward': 1.4601732458947276, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.46)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (7, 7), heading: (0, 1), action: forward, reward: 2.1105380653
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 9, 't': 11, 'action': 'forward', 'reward': 2.1105380653018724, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'right')
Agent followed the waypoint forward. (rewarded 2.11)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (7, 2), heading: (0, 1), action: forward, reward: 1.26275616499
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 8, 't': 12, 'action': 'forward', 'reward': 1.2627561649925394, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent followed the waypoint forward. (rewarded 1.26)
35% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 375
\-------------------------
Environment.reset(): Trial set up with start = (4, 6), destination = (6, 4), deadline = 20
Simulating trial. . .
epsilon = 0.0238; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: forward, reward: 1.91332297005
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'left'), 'deadline': 20, 't': 0, 'action': 'forward', 'reward': 1.913322970045279, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'left')
Agent followed the waypoint forward. (rewarded 1.91)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: None, reward: 2.1019350172
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 19, 't': 1, 'action': None, 'reward': 2.1019350172028344, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.10)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: None, reward: 2.65658545184
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.656585451843901, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.66)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: None, reward: 2.21525712902
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'forward'), 'deadline': 17, 't': 3, 'action': None, 'reward': 2.2152571290202134, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 2.22)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: None, reward: 2.7743104529
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'right'), 'deadline': 16, 't': 4, 'action': None, 'reward': 2.7743104528968137, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'right')
Agent properly idled at a red light. (rewarded 2.77)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: None, reward: 1.09274651183
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 15, 't': 5, 'action': None, 'reward': 1.092746511832527, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.09)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (6, 6), heading: (1, 0), action: forward, reward: 0.983981886796
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 14, 't': 6, 'action': 'forward', 'reward': 0.9839818867956789, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 0.98)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 5), heading: (0, -1), action: left, reward: 2.5880252669
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 13, 't': 7, 'action': 'left', 'reward': 2.5880252669020782, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.59)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (6, 5), heading: (0, -1), action: None, reward: 1.90040130987
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 12, 't': 8, 'action': None, 'reward': 1.900401309872464, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.90)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 4), heading: (0, -1), action: forward, reward: 1.22075460566
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 11, 't': 9, 'action': 'forward', 'reward': 1.2207546056564846, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.22)
50% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 376
\-------------------------
Environment.reset(): Trial set up with start = (6, 5), destination = (1, 6), deadline = 20
Simulating trial. . .
epsilon = 0.0235; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 6), heading: (0, 1), action: right, reward: 1.51667987132
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', 'left'), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.5166798713214629, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', 'left')
Agent drove right instead of forward. (rewarded 1.52)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (7, 6), heading: (1, 0), action: left, reward: 1.57050387128
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 19, 't': 1, 'action': 'left', 'reward': 1.57050387127769, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 1.57)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: forward, reward: 1.33769412325
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 18, 't': 2, 'action': 'forward', 'reward': 1.3376941232524269, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.34)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: forward, reward: 1.1238353296
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 1.123835329599798, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.12)
80% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 377
\-------------------------
Environment.reset(): Trial set up with start = (8, 6), destination = (4, 7), deadline = 25
Simulating trial. . .
epsilon = 0.0233; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 6), heading: (-1, 0), action: None, reward: 2.27404911648
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 25, 't': 0, 'action': None, 'reward': 2.2740491164750725, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.27)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 6), heading: (-1, 0), action: None, reward: 2.84351066817
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 24, 't': 1, 'action': None, 'reward': 2.8435106681700533, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.84)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 6), heading: (-1, 0), action: None, reward: 2.04184822404
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 23, 't': 2, 'action': None, 'reward': 2.04184822404108, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.04)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 6), heading: (-1, 0), action: None, reward: 2.2131728896
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 22, 't': 3, 'action': None, 'reward': 2.213172889603494, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 2.21)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 7), heading: (0, 1), action: left, reward: 1.03406865085
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 21, 't': 4, 'action': 'left', 'reward': 1.0340686508542707, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.03)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 7), heading: (0, 1), action: None, reward: 2.01191247896
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 20, 't': 5, 'action': None, 'reward': 2.0119124789567584, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.01)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (8, 7), heading: (0, 1), action: None, reward: 1.96598921653
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 19, 't': 6, 'action': None, 'reward': 1.9659892165291668, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.97)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: left, reward: 2.84220522244
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 18, 't': 7, 'action': 'left', 'reward': 2.842205222438246, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.84)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: None, reward: 1.64330593288
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 17, 't': 8, 'action': None, 'reward': 1.6433059328819963, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.64)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 7), heading: (1, 0), action: forward, reward: 0.901227731708
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 16, 't': 9, 'action': 'forward', 'reward': 0.9012277317075472, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 0.90)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (2, 7), heading: (1, 0), action: None, reward: 2.57816743748
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 15, 't': 10, 'action': None, 'reward': 2.578167437477684, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.58)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (2, 7), heading: (1, 0), action: None, reward: 2.40816068954
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 14, 't': 11, 'action': None, 'reward': 2.4081606895387258, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.41)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (2, 7), heading: (1, 0), action: None, reward: 1.31871486112
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 13, 't': 12, 'action': None, 'reward': 1.318714861117939, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.32)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (3, 7), heading: (1, 0), action: forward, reward: 1.26150488863
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 12, 't': 13, 'action': 'forward', 'reward': 1.2615048886255142, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.26)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 7), heading: (1, 0), action: forward, reward: 0.819017147922
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 11, 't': 14, 'action': 'forward', 'reward': 0.8190171479215225, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 0.82)
40% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 378
\-------------------------
Environment.reset(): Trial set up with start = (6, 4), destination = (2, 5), deadline = 25
Simulating trial. . .
epsilon = 0.0231; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 4), heading: (0, 1), action: None, reward: 2.80116359473
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 25, 't': 0, 'action': None, 'reward': 2.801163594731167, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 2.80)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 4), heading: (0, 1), action: None, reward: 1.78956491793
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 24, 't': 1, 'action': None, 'reward': 1.7895649179266433, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 1.79)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 4), heading: (0, 1), action: None, reward: 1.70046607183
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.7004660718257523, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 1.70)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 4), heading: (1, 0), action: left, reward: 2.84455233953
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 22, 't': 3, 'action': 'left', 'reward': 2.8445523395295016, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.84)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 4), heading: (1, 0), action: forward, reward: 1.22050280845
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 1.2205028084514422, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.22)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 4), heading: (1, 0), action: None, reward: 2.53196317616
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 20, 't': 5, 'action': None, 'reward': 2.531963176160518, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.53)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (8, 4), heading: (1, 0), action: None, reward: 1.62450186574
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 19, 't': 6, 'action': None, 'reward': 1.6245018657436714, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.62)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 4), heading: (1, 0), action: forward, reward: 0.993158657304
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 0.9931586573039168, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 0.99)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (1, 4), heading: (1, 0), action: left, reward: -10.1761104953
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 17, 't': 8, 'action': 'left', 'reward': -10.176110495308698, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving left through a red light. (rewarded -10.18)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 4), heading: (1, 0), action: forward, reward: 2.30835122104
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 16, 't': 9, 'action': 'forward', 'reward': 2.3083512210407227, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.31)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (2, 5), heading: (0, 1), action: right, reward: 2.22011920083
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 15, 't': 10, 'action': 'right', 'reward': 2.220119200832787, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 2.22)
56% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 379
\-------------------------
Environment.reset(): Trial set up with start = (8, 6), destination = (1, 3), deadline = 20
Simulating trial. . .
epsilon = 0.0228; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: right, reward: 1.46598881112
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'right', None), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.4659888111217545, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', None)
Agent followed the waypoint right. (rewarded 1.47)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: right, reward: 1.03990448344
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 1.0399044834383884, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent followed the waypoint right. (rewarded 1.04)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: None, reward: 1.58597979771
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.5859797977144343, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.59)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (1, 2), heading: (0, 1), action: forward, reward: 2.51092922309
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 2.510929223090397, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.51)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 2), heading: (0, 1), action: None, reward: 1.94498614336
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 16, 't': 4, 'action': None, 'reward': 1.9449861433615676, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.94)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 3), heading: (0, 1), action: forward, reward: 2.74775160772
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 2.747751607715417, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.75)
70% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 380
\-------------------------
Environment.reset(): Trial set up with start = (2, 6), destination = (6, 4), deadline = 30
Simulating trial. . .
epsilon = 0.0226; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: right, reward: 1.73633466042
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'right'), 'deadline': 30, 't': 0, 'action': 'right', 'reward': 1.7363346604166439, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'right')
Agent followed the waypoint right. (rewarded 1.74)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 6), heading: (-1, 0), action: forward, reward: 2.55917917068
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'forward'), 'deadline': 29, 't': 1, 'action': 'forward', 'reward': 2.5591791706763978, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'forward')
Agent followed the waypoint forward. (rewarded 2.56)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 6), heading: (-1, 0), action: None, reward: 2.10904260991
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 28, 't': 2, 'action': None, 'reward': 2.109042609910327, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.11)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 6), heading: (-1, 0), action: None, reward: 2.65392374812
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 27, 't': 3, 'action': None, 'reward': 2.6539237481200546, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.65)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 6), heading: (-1, 0), action: forward, reward: 2.65816313872
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 26, 't': 4, 'action': 'forward', 'reward': 2.6581631387160085, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.66)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 6), heading: (-1, 0), action: None, reward: 1.91863717499
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'left'), 'deadline': 25, 't': 5, 'action': None, 'reward': 1.918637174994073, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'left')
Agent properly idled at a red light. (rewarded 1.92)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (7, 5), heading: (0, -1), action: right, reward: 0.637400006106
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'left'), 'deadline': 24, 't': 6, 'action': 'right', 'reward': 0.6374000061063435, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'left')
Agent drove right instead of forward. (rewarded 0.64)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (7, 5), heading: (0, -1), action: None, reward: 2.74063999105
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 23, 't': 7, 'action': None, 'reward': 2.7406399910498838, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.74)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (7, 5), heading: (0, -1), action: None, reward: 1.90885927169
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 22, 't': 8, 'action': None, 'reward': 1.9088592716897248, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.91)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (7, 5), heading: (0, -1), action: None, reward: 2.17414925282
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 21, 't': 9, 'action': None, 'reward': 2.1741492528195727, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.17)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (7, 5), heading: (0, -1), action: None, reward: 2.02184021567
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 20, 't': 10, 'action': None, 'reward': 2.0218402156723236, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.02)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (8, 5), heading: (1, 0), action: right, reward: 1.21709277997
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', 'left'), 'deadline': 19, 't': 11, 'action': 'right', 'reward': 1.217092779968321, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', 'left')
Agent drove right instead of left. (rewarded 1.22)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (8, 5), heading: (1, 0), action: None, reward: 2.55764381503
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 18, 't': 12, 'action': None, 'reward': 2.5576438150253598, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.56)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (8, 4), heading: (0, -1), action: left, reward: 2.41469651575
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 17, 't': 13, 'action': 'left', 'reward': 2.4146965157466274, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.41)
53% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (8, 4), heading: (0, -1), action: None, reward: 1.38865749339
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 16, 't': 14, 'action': None, 'reward': 1.3886574933896172, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.39)
50% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (8, 4), heading: (0, -1), action: None, reward: 1.90804604628
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 15, 't': 15, 'action': None, 'reward': 1.9080460462758138, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.91)
47% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (1, 4), heading: (1, 0), action: right, reward: 0.102024265547
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 14, 't': 16, 'action': 'right', 'reward': 0.1020242655470538, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 0.10)
43% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (1, 4), heading: (1, 0), action: None, reward: 0.0800787899692
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'forward'), 'deadline': 13, 't': 17, 'action': None, 'reward': 0.08007878996920903, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 0.08)
40% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (1, 5), heading: (0, 1), action: right, reward: 0.716457630815
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', 'left'), 'deadline': 12, 't': 18, 'action': 'right', 'reward': 0.7164576308152626, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', 'left')
Agent followed the waypoint right. (rewarded 0.72)
37% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: right, reward: 1.54250778441
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 11, 't': 19, 'action': 'right', 'reward': 1.5425077844087198, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 1.54)
33% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: None, reward: 2.28157554239
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 10, 't': 20, 'action': None, 'reward': 2.2815755423863173, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.28)
30% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act() [POST]: location: (7, 5), heading: (-1, 0), action: forward, reward: 1.52859999748
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 9, 't': 21, 'action': 'forward', 'reward': 1.528599997481027, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.53)
27% of time remaining to reach destination.
/-------------------
| Step 22 Results
\-------------------
Environment.step(): t = 22
Environment.act() [POST]: location: (7, 5), heading: (-1, 0), action: None, reward: 1.52505069697
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 8, 't': 22, 'action': None, 'reward': 1.5250506969690707, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.53)
23% of time remaining to reach destination.
/-------------------
| Step 23 Results
\-------------------
Environment.step(): t = 23
Environment.act() [POST]: location: (7, 5), heading: (-1, 0), action: None, reward: 2.24365660216
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 7, 't': 23, 'action': None, 'reward': 2.243656602160998, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.24)
20% of time remaining to reach destination.
/-------------------
| Step 24 Results
\-------------------
Environment.step(): t = 24
Environment.act() [POST]: location: (7, 5), heading: (-1, 0), action: None, reward: 1.37405084793
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 6, 't': 24, 'action': None, 'reward': 1.3740508479306446, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.37)
17% of time remaining to reach destination.
/-------------------
| Step 25 Results
\-------------------
Environment.step(): t = 25
Environment.act() [POST]: location: (7, 5), heading: (-1, 0), action: None, reward: 1.97752615272
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 5, 't': 25, 'action': None, 'reward': 1.977526152719353, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.98)
13% of time remaining to reach destination.
/-------------------
| Step 26 Results
\-------------------
Environment.step(): t = 26
Environment.act() [POST]: location: (6, 5), heading: (-1, 0), action: forward, reward: 1.80624671511
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 4, 't': 26, 'action': 'forward', 'reward': 1.8062467151072332, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.81)
10% of time remaining to reach destination.
/-------------------
| Step 27 Results
\-------------------
Environment.step(): t = 27
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 4), heading: (0, -1), action: right, reward: 0.626816889783
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', 'left'), 'deadline': 3, 't': 27, 'action': 'right', 'reward': 0.6268168897832764, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', 'left')
Agent followed the waypoint right. (rewarded 0.63)
7% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 381
\-------------------------
Environment.reset(): Trial set up with start = (7, 3), destination = (3, 4), deadline = 25
Simulating trial. . .
epsilon = 0.0224; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (7, 3), heading: (0, 1), action: None, reward: 1.9864397477
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 25, 't': 0, 'action': None, 'reward': 1.9864397476983724, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.99)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (7, 3), heading: (0, 1), action: None, reward: 1.28334431906
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 24, 't': 1, 'action': None, 'reward': 1.2833443190582745, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.28)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 3), heading: (0, 1), action: None, reward: 1.16013095507
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.1601309550669683, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.16)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: left, reward: 1.44547005863
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'right'), 'deadline': 22, 't': 3, 'action': 'left', 'reward': 1.4454700586257383, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'right')
Agent followed the waypoint left. (rewarded 1.45)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: forward, reward: 1.3313242124
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 1.3313242123996059, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.33)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: None, reward: 2.55044486883
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 20, 't': 5, 'action': None, 'reward': 2.5504448688302377, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.55)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: None, reward: 2.80087983915
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 19, 't': 6, 'action': None, 'reward': 2.8008798391526923, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.80)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: forward, reward: 2.24951275823
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 2.2495127582321643, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.25)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 2.05771180304
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 17, 't': 8, 'action': None, 'reward': 2.0577118030386767, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.06)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 1.41159828351
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 16, 't': 9, 'action': None, 'reward': 1.4115982835148229, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 1.41)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 1.04381912573
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 15, 't': 10, 'action': None, 'reward': 1.0438191257290192, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.04)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: forward, reward: 2.37965844229
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 14, 't': 11, 'action': 'forward', 'reward': 2.3796584422911415, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.38)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 4), heading: (0, 1), action: right, reward: 1.8362730122
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 13, 't': 12, 'action': 'right', 'reward': 1.8362730121957553, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.84)
48% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 382
\-------------------------
Environment.reset(): Trial set up with start = (3, 2), destination = (6, 7), deadline = 20
Simulating trial. . .
epsilon = 0.0221; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: right, reward: 0.953239243372
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'left'), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 0.9532392433719161, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'left')
Agent drove right instead of left. (rewarded 0.95)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 2), heading: (-1, 0), action: forward, reward: 1.57479133641
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 19, 't': 1, 'action': 'forward', 'reward': 1.574791336413662, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.57)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: forward, reward: 2.14229169759
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 18, 't': 2, 'action': 'forward', 'reward': 2.142291697591247, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 2.14)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: None, reward: 1.29340839756
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'right'), 'deadline': 17, 't': 3, 'action': None, 'reward': 1.2934083975593265, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 1.29)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 2), heading: (-1, 0), action: forward, reward: 2.24346811137
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 2.243468111374204, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.24)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (6, 2), heading: (-1, 0), action: forward, reward: 0.940113714417
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 0.9401137144168774, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 0.94)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (5, 2), heading: (-1, 0), action: forward, reward: -0.0298653022757
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', 'right'), 'deadline': 14, 't': 6, 'action': 'forward', 'reward': -0.029865302275655203, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', 'right')
Agent drove forward instead of right. (rewarded -0.03)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (5, 7), heading: (0, -1), action: right, reward: 1.919148061
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 13, 't': 7, 'action': 'right', 'reward': 1.9191480609987939, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 1.92)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 7), heading: (1, 0), action: right, reward: 2.3439840358
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 12, 't': 8, 'action': 'right', 'reward': 2.343984035801345, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent followed the waypoint right. (rewarded 2.34)
55% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 383
\-------------------------
Environment.reset(): Trial set up with start = (5, 7), destination = (8, 4), deadline = 30
Simulating trial. . .
epsilon = 0.0219; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 7), heading: (0, 1), action: None, reward: 1.9854754267
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'right'), 'deadline': 30, 't': 0, 'action': None, 'reward': 1.985475426697219, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'right')
Agent properly idled at a red light. (rewarded 1.99)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 7), heading: (0, 1), action: None, reward: 1.20375099836
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'left'), 'deadline': 29, 't': 1, 'action': None, 'reward': 1.2037509983560684, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 1.20)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 7), heading: (0, 1), action: None, reward: 1.7499049755
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'forward'), 'deadline': 28, 't': 2, 'action': None, 'reward': 1.7499049755020242, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 1.75)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 7), heading: (0, 1), action: None, reward: 2.13901954565
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 27, 't': 3, 'action': None, 'reward': 2.139019545647709, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.14)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 7), heading: (0, 1), action: None, reward: 2.90173478528
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'left'), 'deadline': 26, 't': 4, 'action': None, 'reward': 2.90173478527998, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 2.90)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (6, 7), heading: (1, 0), action: left, reward: 2.03745888388
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 25, 't': 5, 'action': 'left', 'reward': 2.0374588838806114, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.04)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (7, 7), heading: (1, 0), action: forward, reward: 0.987521615888
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 24, 't': 6, 'action': 'forward', 'reward': 0.9875216158876867, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 0.99)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (7, 7), heading: (1, 0), action: None, reward: 2.5748002811
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 23, 't': 7, 'action': None, 'reward': 2.5748002810997392, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.57)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (7, 7), heading: (1, 0), action: None, reward: 2.62336681474
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 22, 't': 8, 'action': None, 'reward': 2.6233668147369715, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.62)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (7, 7), heading: (1, 0), action: None, reward: 2.71292819632
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 21, 't': 9, 'action': None, 'reward': 2.71292819631912, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.71)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: forward, reward: 2.84424370742
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 20, 't': 10, 'action': 'forward', 'reward': 2.844243707419297, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.84)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: None, reward: 1.3992646491
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', 'forward'), 'deadline': 19, 't': 11, 'action': None, 'reward': 1.3992646491031668, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 1.40)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (8, 2), heading: (0, 1), action: right, reward: 2.09532826191
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 18, 't': 12, 'action': 'right', 'reward': 2.095328261914972, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 2.10)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (7, 2), heading: (-1, 0), action: right, reward: 0.656509923839
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', 'forward'), 'deadline': 17, 't': 13, 'action': 'right', 'reward': 0.6565099238392047, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', 'forward')
Agent drove right instead of forward. (rewarded 0.66)
53% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (7, 3), heading: (0, 1), action: left, reward: 1.38688452697
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 16, 't': 14, 'action': 'left', 'reward': 1.3868845269721854, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.39)
50% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: left, reward: 1.82770527363
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 15, 't': 15, 'action': 'left', 'reward': 1.8277052736296135, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 1.83)
47% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (8, 4), heading: (0, 1), action: right, reward: 2.34584364951
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 14, 't': 16, 'action': 'right', 'reward': 2.345843649513288, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.35)
43% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 384
\-------------------------
Environment.reset(): Trial set up with start = (7, 6), destination = (3, 3), deadline = 35
Simulating trial. . .
epsilon = 0.0217; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: forward, reward: 1.26124622155
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 35, 't': 0, 'action': 'forward', 'reward': 1.2612462215527138, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.26)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: None, reward: 2.59279064838
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 34, 't': 1, 'action': None, 'reward': 2.5927906483813277, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.59)
94% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: None, reward: 2.83866401903
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 33, 't': 2, 'action': None, 'reward': 2.838664019033431, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.84)
91% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: None, reward: 1.54761344209
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 32, 't': 3, 'action': None, 'reward': 1.5476134420906953, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.55)
89% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: forward, reward: 1.22508610074
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 31, 't': 4, 'action': 'forward', 'reward': 1.2250861007385565, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.23)
86% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: None, reward: 2.06214652446
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 30, 't': 5, 'action': None, 'reward': 2.062146524463455, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.06)
83% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: None, reward: 1.25554793803
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 29, 't': 6, 'action': None, 'reward': 1.25554793802931, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.26)
80% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: forward, reward: 2.42424004918
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 28, 't': 7, 'action': 'forward', 'reward': 2.424240049177583, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.42)
77% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: None, reward: 2.28601696
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 27, 't': 8, 'action': None, 'reward': 2.2860169599953286, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 2.29)
74% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: forward, reward: 2.42973339916
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 26, 't': 9, 'action': 'forward', 'reward': 2.4297333991563956, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.43)
71% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (3, 7), heading: (0, 1), action: right, reward: 1.03365708691
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 25, 't': 10, 'action': 'right', 'reward': 1.033657086913198, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.03)
69% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (3, 7), heading: (0, 1), action: None, reward: 2.71801129863
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 24, 't': 11, 'action': None, 'reward': 2.7180112986277214, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.72)
66% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (3, 7), heading: (0, 1), action: None, reward: 1.43973986424
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 23, 't': 12, 'action': None, 'reward': 1.4397398642389423, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.44)
63% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (3, 2), heading: (0, 1), action: forward, reward: 1.24787373004
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 22, 't': 13, 'action': 'forward', 'reward': 1.2478737300356593, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.25)
60% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (3, 2), heading: (0, 1), action: None, reward: 1.20387994159
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 21, 't': 14, 'action': None, 'reward': 1.203879941586838, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.20)
57% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (3, 2), heading: (0, 1), action: None, reward: 1.5229582202
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 20, 't': 15, 'action': None, 'reward': 1.5229582201961338, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.52)
54% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (3, 2), heading: (0, 1), action: None, reward: 2.45799470556
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'left'), 'deadline': 19, 't': 16, 'action': None, 'reward': 2.457994705564818, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'left')
Agent properly idled at a red light. (rewarded 2.46)
51% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 3), heading: (0, 1), action: forward, reward: 2.50934675361
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 18, 't': 17, 'action': 'forward', 'reward': 2.5093467536121543, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.51)
49% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 385
\-------------------------
Environment.reset(): Trial set up with start = (7, 7), destination = (2, 4), deadline = 30
Simulating trial. . .
epsilon = 0.0215; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: left, reward: 1.63822632939
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', 'forward'), 'deadline': 30, 't': 0, 'action': 'left', 'reward': 1.638226329394342, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', 'forward')
Agent drove left instead of right. (rewarded 1.64)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 2), heading: (0, 1), action: left, reward: 2.82475304199
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 29, 't': 1, 'action': 'left', 'reward': 2.824753041993582, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.82)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 2), heading: (0, 1), action: None, reward: 1.68948741408
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'left'), 'deadline': 28, 't': 2, 'action': None, 'reward': 1.6894874140832894, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 1.69)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 2), heading: (1, 0), action: left, reward: 1.92855931455
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 27, 't': 3, 'action': 'left', 'reward': 1.9285593145526494, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.93)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 2), heading: (1, 0), action: None, reward: 2.14593048367
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 26, 't': 4, 'action': None, 'reward': 2.1459304836711857, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.15)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 2), heading: (1, 0), action: forward, reward: 2.67976398816
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 25, 't': 5, 'action': 'forward', 'reward': 2.679763988158898, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.68)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: forward, reward: 2.25211188052
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'left'), 'deadline': 24, 't': 6, 'action': 'forward', 'reward': 2.252111880516438, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'left')
Agent followed the waypoint forward. (rewarded 2.25)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: None, reward: 1.03048283043
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 23, 't': 7, 'action': None, 'reward': 1.0304828304299283, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.03)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: None, reward: 2.35987382531
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 22, 't': 8, 'action': None, 'reward': 2.359873825311082, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 2.36)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (1, 7), heading: (0, -1), action: left, reward: 0.682632574497
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'right'), 'deadline': 21, 't': 9, 'action': 'left', 'reward': 0.6826325744973805, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'right')
Agent drove left instead of forward. (rewarded 0.68)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (2, 7), heading: (1, 0), action: right, reward: 1.65875278683
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 20, 't': 10, 'action': 'right', 'reward': 1.658752786834056, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.66)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (2, 2), heading: (0, 1), action: right, reward: 1.86770087953
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 19, 't': 11, 'action': 'right', 'reward': 1.8677008795292898, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.87)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (2, 3), heading: (0, 1), action: forward, reward: 1.18831078782
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 18, 't': 12, 'action': 'forward', 'reward': 1.1883107878227017, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.19)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (2, 4), heading: (0, 1), action: forward, reward: 1.02131984684
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 17, 't': 13, 'action': 'forward', 'reward': 1.021319846837352, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.02)
53% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 386
\-------------------------
Environment.reset(): Trial set up with start = (7, 3), destination = (3, 5), deadline = 30
Simulating trial. . .
epsilon = 0.0213; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: forward, reward: 1.62547443617
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 30, 't': 0, 'action': 'forward', 'reward': 1.6254744361656333, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.63)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: forward, reward: 1.53536130199
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 29, 't': 1, 'action': 'forward', 'reward': 1.5353613019883494, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.54)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: None, reward: 2.23602443635
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 28, 't': 2, 'action': None, 'reward': 2.2360244363483313, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.24)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: forward, reward: 1.85565331155
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'forward'), 'deadline': 27, 't': 3, 'action': 'forward', 'reward': 1.8556533115504559, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'forward')
Agent followed the waypoint forward. (rewarded 1.86)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 1.58700461683
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 26, 't': 4, 'action': None, 'reward': 1.5870046168260572, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.59)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 1.2658690919
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 25, 't': 5, 'action': None, 'reward': 1.2658690918966118, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.27)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: forward, reward: 2.15316138238
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 24, 't': 6, 'action': 'forward', 'reward': 2.1531613823784, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.15)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 4), heading: (0, 1), action: right, reward: 2.873370477
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 23, 't': 7, 'action': 'right', 'reward': 2.8733704769965422, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent followed the waypoint right. (rewarded 2.87)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 4), heading: (0, 1), action: None, reward: 2.65632752067
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 22, 't': 8, 'action': None, 'reward': 2.6563275206699957, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.66)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (4, 4), heading: (1, 0), action: left, reward: 0.29668796744
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'right'), 'deadline': 21, 't': 9, 'action': 'left', 'reward': 0.2966879674398182, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'right')
Agent drove left instead of forward. (rewarded 0.30)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (4, 5), heading: (0, 1), action: right, reward: 1.85445886008
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', None), 'deadline': 20, 't': 10, 'action': 'right', 'reward': 1.8544588600804224, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', None)
Agent followed the waypoint right. (rewarded 1.85)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 5), heading: (-1, 0), action: right, reward: 1.17020442613
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'right', None), 'deadline': 19, 't': 11, 'action': 'right', 'reward': 1.1702044261329831, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'right', None)
Agent followed the waypoint right. (rewarded 1.17)
60% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 387
\-------------------------
Environment.reset(): Trial set up with start = (3, 4), destination = (8, 7), deadline = 30
Simulating trial. . .
epsilon = 0.0211; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 4), heading: (-1, 0), action: right, reward: 2.31009743333
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 30, 't': 0, 'action': 'right', 'reward': 2.31009743333221, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent followed the waypoint right. (rewarded 2.31)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: forward, reward: 1.99563774458
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 29, 't': 1, 'action': 'forward', 'reward': 1.9956377445839135, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.00)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: None, reward: 2.19203158725
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 28, 't': 2, 'action': None, 'reward': 2.192031587251072, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.19)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 4), heading: (-1, 0), action: forward, reward: 2.52765008933
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 27, 't': 3, 'action': 'forward', 'reward': 2.5276500893312255, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.53)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 3), heading: (0, -1), action: right, reward: 2.75701621996
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 26, 't': 4, 'action': 'right', 'reward': 2.757016219956747, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 2.76)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 2), heading: (0, -1), action: forward, reward: 1.41323036658
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 25, 't': 5, 'action': 'forward', 'reward': 1.413230366580918, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.41)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (8, 7), heading: (0, -1), action: forward, reward: 1.60834002404
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 24, 't': 6, 'action': 'forward', 'reward': 1.6083400240390895, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.61)
77% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 388
\-------------------------
Environment.reset(): Trial set up with start = (3, 4), destination = (7, 4), deadline = 20
Simulating trial. . .
epsilon = 0.0209; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 4), heading: (0, -1), action: None, reward: 1.44728762191
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 20, 't': 0, 'action': None, 'reward': 1.4472876219086455, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.45)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 4), heading: (0, -1), action: None, reward: 2.11987643812
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 19, 't': 1, 'action': None, 'reward': 2.1198764381170987, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.12)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 4), heading: (0, -1), action: None, reward: 1.71233368203
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'right'), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.7123336820263144, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 1.71)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 4), heading: (0, -1), action: None, reward: 2.03910853341
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 17, 't': 3, 'action': None, 'reward': 2.039108533409434, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.04)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 4), heading: (0, -1), action: None, reward: 1.76273460497
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 16, 't': 4, 'action': None, 'reward': 1.7627346049694184, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.76)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 4), heading: (-1, 0), action: left, reward: 2.40257260228
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 15, 't': 5, 'action': 'left', 'reward': 2.4025726022829907, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.40)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 4), heading: (-1, 0), action: None, reward: 1.85648482134
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 14, 't': 6, 'action': None, 'reward': 1.8564848213430867, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.86)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: forward, reward: 1.11639399561
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 1.116393995608562, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.12)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: None, reward: 2.58150101995
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 12, 't': 8, 'action': None, 'reward': 2.5815010199539725, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.58)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (8, 4), heading: (-1, 0), action: forward, reward: 2.77929453125
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 11, 't': 9, 'action': 'forward', 'reward': 2.7792945312474022, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.78)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (8, 4), heading: (-1, 0), action: None, reward: 0.888089633185
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'right'), 'deadline': 10, 't': 10, 'action': None, 'reward': 0.888089633185428, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 0.89)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (8, 4), heading: (-1, 0), action: None, reward: 0.980899366365
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 9, 't': 11, 'action': None, 'reward': 0.9808993663648629, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.98)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (8, 4), heading: (-1, 0), action: None, reward: 1.64839584173
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 8, 't': 12, 'action': None, 'reward': 1.6483958417318907, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.65)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (7, 4), heading: (-1, 0), action: forward, reward: 1.15440211741
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 7, 't': 13, 'action': 'forward', 'reward': 1.154402117407002, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.15)
30% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 389
\-------------------------
Environment.reset(): Trial set up with start = (1, 6), destination = (4, 7), deadline = 20
Simulating trial. . .
epsilon = 0.0207; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: left, reward: 1.65343309707
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 20, 't': 0, 'action': 'left', 'reward': 1.6534330970733417, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 1.65)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 7), heading: (1, 0), action: left, reward: 2.01870217173
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 19, 't': 1, 'action': 'left', 'reward': 2.0187021717260896, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 2.02)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 7), heading: (1, 0), action: forward, reward: 1.49559352012
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 18, 't': 2, 'action': 'forward', 'reward': 1.4955935201237012, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.50)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 7), heading: (1, 0), action: forward, reward: 1.65256369339
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 1.652563693388058, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.65)
80% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 390
\-------------------------
Environment.reset(): Trial set up with start = (8, 2), destination = (5, 7), deadline = 20
Simulating trial. . .
epsilon = 0.0204; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: left, reward: 1.96780178122
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', 'forward'), 'deadline': 20, 't': 0, 'action': 'left', 'reward': 1.9678017812152235, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', 'forward')
Agent drove left instead of right. (rewarded 1.97)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: None, reward: 1.62730320858
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 19, 't': 1, 'action': None, 'reward': 1.6273032085804418, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.63)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: None, reward: 1.01977033982
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.0197703398204234, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.02)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (1, 7), heading: (0, -1), action: left, reward: 1.56249481869
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 17, 't': 3, 'action': 'left', 'reward': 1.5624948186867462, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.56)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: left, reward: 1.96090129549
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 16, 't': 4, 'action': 'left', 'reward': 1.960901295487166, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.96)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: None, reward: 1.02982630433
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 15, 't': 5, 'action': None, 'reward': 1.0298263043251248, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.03)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: None, reward: 1.95425890612
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 14, 't': 6, 'action': None, 'reward': 1.9542589061209057, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.95)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: forward, reward: 0.956753244862
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 0.9567532448623035, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 0.96)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: None, reward: 1.94210737529
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 12, 't': 8, 'action': None, 'reward': 1.9421073752949107, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.94)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: None, reward: 1.41177169694
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 11, 't': 9, 'action': None, 'reward': 1.4117716969376175, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.41)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: forward, reward: 1.72247210056
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 10, 't': 10, 'action': 'forward', 'reward': 1.722472100555838, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.72)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 7), heading: (-1, 0), action: forward, reward: 2.55770995695
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 9, 't': 11, 'action': 'forward', 'reward': 2.557709956950932, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.56)
40% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 391
\-------------------------
Environment.reset(): Trial set up with start = (8, 6), destination = (4, 6), deadline = 20
Simulating trial. . .
epsilon = 0.0202; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: None, reward: 2.11648767473
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 20, 't': 0, 'action': None, 'reward': 2.11648767473494, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 2.12)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: None, reward: 1.28064271176
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 19, 't': 1, 'action': None, 'reward': 1.2806427117636427, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.28)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: None, reward: 2.69932826025
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.699328260249572, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.70)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: forward, reward: -9.54428950292
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 2, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': -9.544289502918124, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent attempted driving forward through a red light. (rewarded -9.54)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: forward, reward: 1.03888987081
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 1.0388898708071272, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.04)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: None, reward: 1.32018122546
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 15, 't': 5, 'action': None, 'reward': 1.3201812254567742, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.32)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: None, reward: 2.85453619365
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 14, 't': 6, 'action': None, 'reward': 2.854536193648575, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.85)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: forward, reward: 1.34371546834
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'forward'), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 1.3437154683441475, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'forward')
Agent followed the waypoint forward. (rewarded 1.34)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: forward, reward: 2.69395189043
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 12, 't': 8, 'action': 'forward', 'reward': 2.6939518904299904, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 2.69)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: None, reward: 1.15242183031
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 11, 't': 9, 'action': None, 'reward': 1.1524218303073563, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.15)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: None, reward: 2.60391324759
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 10, 't': 10, 'action': None, 'reward': 2.603913247588202, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.60)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 6), heading: (1, 0), action: forward, reward: 1.28973051776
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 9, 't': 11, 'action': 'forward', 'reward': 1.2897305177556504, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.29)
40% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 392
\-------------------------
Environment.reset(): Trial set up with start = (3, 6), destination = (8, 2), deadline = 25
Simulating trial. . .
epsilon = 0.0200; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 6), heading: (0, -1), action: None, reward: 2.18213812063
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 25, 't': 0, 'action': None, 'reward': 2.1821381206319597, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.18)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 6), heading: (0, -1), action: None, reward: 1.50227109962
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 24, 't': 1, 'action': None, 'reward': 1.5022710996187518, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.50)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 6), heading: (0, -1), action: None, reward: 2.87383035848
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 23, 't': 2, 'action': None, 'reward': 2.873830358478987, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.87)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: left, reward: 1.35734979353
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 22, 't': 3, 'action': 'left', 'reward': 1.3573497935321663, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.36)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: None, reward: 2.32219959617
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 21, 't': 4, 'action': None, 'reward': 2.3221995961676143, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.32)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: forward, reward: 1.10829422879
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 20, 't': 5, 'action': 'forward', 'reward': 1.1082942287926278, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.11)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (8, 6), heading: (-1, 0), action: forward, reward: 2.28612998285
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 19, 't': 6, 'action': 'forward', 'reward': 2.286129982845144, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.29)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (8, 7), heading: (0, 1), action: left, reward: 1.0913134869
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 18, 't': 7, 'action': 'left', 'reward': 1.091313486901817, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 1.09)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (8, 7), heading: (0, 1), action: None, reward: 1.58868734761
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 17, 't': 8, 'action': None, 'reward': 1.5886873476104026, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.59)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (8, 7), heading: (0, 1), action: None, reward: 1.35039090519
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 16, 't': 9, 'action': None, 'reward': 1.3503909051905638, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.35)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (8, 2), heading: (0, 1), action: forward, reward: 1.54244466256
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 15, 't': 10, 'action': 'forward', 'reward': 1.5424446625565238, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.54)
56% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 393
\-------------------------
Environment.reset(): Trial set up with start = (8, 2), destination = (3, 7), deadline = 20
Simulating trial. . .
epsilon = 0.0198; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 7), heading: (0, -1), action: right, reward: 1.96877768011
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.9687776801068395, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent followed the waypoint right. (rewarded 1.97)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: right, reward: 1.75061837842
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 1.7506183784189524, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 1.75)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: None, reward: 2.36218881626
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.362188816264669, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.36)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 7), heading: (1, 0), action: forward, reward: 1.70861539506
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 1.7086153950615288, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.71)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 7), heading: (1, 0), action: None, reward: 2.06221888987
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 16, 't': 4, 'action': None, 'reward': 2.0622188898725833, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.06)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 7), heading: (1, 0), action: None, reward: 1.58925172146
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 15, 't': 5, 'action': None, 'reward': 1.5892517214597928, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.59)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 7), heading: (1, 0), action: None, reward: 2.82357589735
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'forward'), 'deadline': 14, 't': 6, 'action': None, 'reward': 2.8235758973475384, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 2.82)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 7), heading: (1, 0), action: forward, reward: 2.4364692242
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 2.4364692241994965, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.44)
60% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 394
\-------------------------
Environment.reset(): Trial set up with start = (7, 2), destination = (2, 6), deadline = 25
Simulating trial. . .
epsilon = 0.0196; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (7, 2), heading: (-1, 0), action: None, reward: 1.05638972698
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', 'forward'), 'deadline': 25, 't': 0, 'action': None, 'reward': 1.0563897269836557, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 1.06)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (7, 7), heading: (0, -1), action: right, reward: 2.4776136138
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 24, 't': 1, 'action': 'right', 'reward': 2.4776136137993867, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 2.48)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: right, reward: 1.97780740173
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'right', 'left'), 'deadline': 23, 't': 2, 'action': 'right', 'reward': 1.977807401732935, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'right', 'left')
Agent followed the waypoint right. (rewarded 1.98)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: None, reward: 2.45720987369
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 22, 't': 3, 'action': None, 'reward': 2.4572098736916317, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.46)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: forward, reward: 2.70337948131
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 2.7033794813071785, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.70)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 7), heading: (1, 0), action: forward, reward: 1.78291464315
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'right'), 'deadline': 20, 't': 5, 'action': 'forward', 'reward': 1.7829146431487852, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'right')
Agent followed the waypoint forward. (rewarded 1.78)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 7), heading: (1, 0), action: None, reward: 2.14278924919
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'left'), 'deadline': 19, 't': 6, 'action': None, 'reward': 2.1427892491927416, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 2.14)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (2, 6), heading: (0, -1), action: left, reward: 2.01232186457
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 18, 't': 7, 'action': 'left', 'reward': 2.01232186456743, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.01)
68% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 395
\-------------------------
Environment.reset(): Trial set up with start = (7, 6), destination = (1, 3), deadline = 25
Simulating trial. . .
epsilon = 0.0194; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 6), heading: (-1, 0), action: forward, reward: 0.157615977138
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', 'forward'), 'deadline': 25, 't': 0, 'action': 'forward', 'reward': 0.15761597713778586, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', 'forward')
Agent drove forward instead of left. (rewarded 0.16)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 7), heading: (0, 1), action: left, reward: 2.02667948494
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 24, 't': 1, 'action': 'left', 'reward': 2.026679484935796, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.03)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 7), heading: (0, 1), action: None, reward: 2.82124833043
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', 'right'), 'deadline': 23, 't': 2, 'action': None, 'reward': 2.8212483304338676, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'right')
Agent properly idled at a red light. (rewarded 2.82)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 7), heading: (-1, 0), action: right, reward: 1.00167383867
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 22, 't': 3, 'action': 'right', 'reward': 1.0016738386741983, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 1.00)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 7), heading: (-1, 0), action: None, reward: 1.12289841383
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 21, 't': 4, 'action': None, 'reward': 1.1228984138267382, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.12)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 2), heading: (0, 1), action: left, reward: 1.152400921
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'right'), 'deadline': 20, 't': 5, 'action': 'left', 'reward': 1.1524009209975794, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'right')
Agent followed the waypoint left. (rewarded 1.15)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (6, 2), heading: (1, 0), action: left, reward: 1.21868012218
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 19, 't': 6, 'action': 'left', 'reward': 1.2186801221769752, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 1.22)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 2), heading: (1, 0), action: None, reward: 1.63067474647
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 18, 't': 7, 'action': None, 'reward': 1.6306747464734455, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.63)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (6, 2), heading: (1, 0), action: None, reward: 0.961179214735
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', 'forward'), 'deadline': 17, 't': 8, 'action': None, 'reward': 0.961179214734625, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', 'forward')
Agent properly idled at a red light. (rewarded 0.96)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (7, 2), heading: (1, 0), action: forward, reward: 1.22589280085
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 16, 't': 9, 'action': 'forward', 'reward': 1.225892800854831, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.23)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (8, 2), heading: (1, 0), action: forward, reward: 1.54481404683
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 15, 't': 10, 'action': 'forward', 'reward': 1.5448140468327354, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.54)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (8, 2), heading: (1, 0), action: None, reward: 2.3110124866
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 14, 't': 11, 'action': None, 'reward': 2.311012486598865, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.31)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (8, 2), heading: (1, 0), action: None, reward: 1.43635978783
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 13, 't': 12, 'action': None, 'reward': 1.4363597878347436, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.44)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: forward, reward: 1.55644166979
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 12, 't': 13, 'action': 'forward', 'reward': 1.5564416697874668, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.56)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 3), heading: (0, 1), action: right, reward: 2.47116638761
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 11, 't': 14, 'action': 'right', 'reward': 2.4711663876065737, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent followed the waypoint right. (rewarded 2.47)
40% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 396
\-------------------------
Environment.reset(): Trial set up with start = (4, 4), destination = (1, 5), deadline = 20
Simulating trial. . .
epsilon = 0.0193; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 4), heading: (-1, 0), action: right, reward: 1.35282418781
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.3528241878054321, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.35)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 4), heading: (-1, 0), action: forward, reward: 1.51323251548
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 19, 't': 1, 'action': 'forward', 'reward': 1.5132325154760422, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.51)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: forward, reward: 2.4562781896
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 18, 't': 2, 'action': 'forward', 'reward': 2.4562781895952126, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.46)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (1, 3), heading: (0, -1), action: right, reward: 0.638277676976
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 17, 't': 3, 'action': 'right', 'reward': 0.6382776769760511, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 0.64)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 3), heading: (-1, 0), action: left, reward: 1.8018300275
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', 'forward'), 'deadline': 16, 't': 4, 'action': 'left', 'reward': 1.801830027497505, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', 'forward')
Agent drove left instead of right. (rewarded 1.80)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (8, 2), heading: (0, -1), action: right, reward: 1.76141375666
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', 'left'), 'deadline': 15, 't': 5, 'action': 'right', 'reward': 1.7614137566627697, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', 'left')
Agent drove right instead of left. (rewarded 1.76)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: right, reward: 1.49429079843
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 14, 't': 6, 'action': 'right', 'reward': 1.4942907984294733, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.49)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 2), heading: (1, 0), action: forward, reward: 0.904838309144
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', None), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 0.9048383091436851, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', None)
Agent drove forward instead of left. (rewarded 0.90)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 7), heading: (0, -1), action: left, reward: 1.02248390431
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 12, 't': 8, 'action': 'left', 'reward': 1.022483904307466, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.02)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 7), heading: (0, -1), action: None, reward: 1.3000182454
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', 'left'), 'deadline': 11, 't': 9, 'action': None, 'reward': 1.3000182454004596, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'left')
Agent properly idled at a red light. (rewarded 1.30)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (3, 7), heading: (1, 0), action: right, reward: 1.3306498179
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 10, 't': 10, 'action': 'right', 'reward': 1.3306498178972022, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 1.33)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (3, 7), heading: (1, 0), action: None, reward: 2.5635972457
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 9, 't': 11, 'action': None, 'reward': 2.5635972456996052, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.56)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (3, 7), heading: (1, 0), action: None, reward: 1.30600509554
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 8, 't': 12, 'action': None, 'reward': 1.3060050955372435, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.31)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (3, 6), heading: (0, -1), action: left, reward: 1.07518624477
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 7, 't': 13, 'action': 'left', 'reward': 1.0751862447666922, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.08)
30% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (3, 6), heading: (0, -1), action: None, reward: 2.28859902003
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 6, 't': 14, 'action': None, 'reward': 2.288599020032084, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.29)
25% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: left, reward: 2.2854874666
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'right'), 'deadline': 5, 't': 15, 'action': 'left', 'reward': 2.285487466597015, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'right')
Agent followed the waypoint left. (rewarded 2.29)
20% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: None, reward: 1.45561636583
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 4, 't': 16, 'action': None, 'reward': 1.455616365827883, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.46)
15% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: forward, reward: 0.445339318177
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 3, 't': 17, 'action': 'forward', 'reward': 0.445339318177179, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 0.45)
10% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 5), heading: (0, -1), action: right, reward: 1.81625443854
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 2, 't': 18, 'action': 'right', 'reward': 1.8162544385376647, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.82)
5% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 397
\-------------------------
Environment.reset(): Trial set up with start = (1, 4), destination = (4, 7), deadline = 30
Simulating trial. . .
epsilon = 0.0191; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 4), heading: (0, 1), action: None, reward: 1.76159120556
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 30, 't': 0, 'action': None, 'reward': 1.7615912055578462, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.76)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 4), heading: (0, 1), action: None, reward: 1.53724907553
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 29, 't': 1, 'action': None, 'reward': 1.5372490755283297, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 1.54)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 4), heading: (0, 1), action: None, reward: 1.3873157892
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 28, 't': 2, 'action': None, 'reward': 1.3873157891977967, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.39)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 4), heading: (1, 0), action: left, reward: 2.29967244221
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 27, 't': 3, 'action': 'left', 'reward': 2.2996724422144035, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.30)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 4), heading: (1, 0), action: None, reward: 1.73145869555
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 26, 't': 4, 'action': None, 'reward': 1.73145869555059, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.73)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 4), heading: (1, 0), action: None, reward: 1.55333163944
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'right'), 'deadline': 25, 't': 5, 'action': None, 'reward': 1.553331639439107, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 1.55)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 4), heading: (1, 0), action: None, reward: 2.09222730663
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', 'forward'), 'deadline': 24, 't': 6, 'action': None, 'reward': 2.0922273066286436, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', 'forward')
Agent properly idled at a red light. (rewarded 2.09)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: forward, reward: 2.55903968347
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 23, 't': 7, 'action': 'forward', 'reward': 2.559039683468257, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.56)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: None, reward: 1.87425025909
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'right'), 'deadline': 22, 't': 8, 'action': None, 'reward': 1.8742502590861985, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'right')
Agent properly idled at a red light. (rewarded 1.87)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (4, 4), heading: (1, 0), action: forward, reward: 1.8711951797
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 21, 't': 9, 'action': 'forward', 'reward': 1.8711951796967083, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent followed the waypoint forward. (rewarded 1.87)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (4, 4), heading: (1, 0), action: None, reward: 2.27867888921
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 20, 't': 10, 'action': None, 'reward': 2.2786788892092114, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.28)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (4, 5), heading: (0, 1), action: right, reward: -0.00140241589573
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 19, 't': 11, 'action': 'right', 'reward': -0.0014024158957272048, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded -0.00)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (4, 5), heading: (0, 1), action: None, reward: 2.04976253497
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 18, 't': 12, 'action': None, 'reward': 2.0497625349689903, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.05)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (4, 6), heading: (0, 1), action: forward, reward: 1.74541877588
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 17, 't': 13, 'action': 'forward', 'reward': 1.7454187758765407, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'right')
Agent followed the waypoint forward. (rewarded 1.75)
53% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (4, 6), heading: (0, 1), action: None, reward: 2.49624077813
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 16, 't': 14, 'action': None, 'reward': 2.4962407781347027, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.50)
50% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (4, 6), heading: (0, 1), action: None, reward: 2.20147286051
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 15, 't': 15, 'action': None, 'reward': 2.201472860510663, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.20)
47% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 7), heading: (0, 1), action: forward, reward: 2.56660849496
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 14, 't': 16, 'action': 'forward', 'reward': 2.566608494958234, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.57)
43% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 398
\-------------------------
Environment.reset(): Trial set up with start = (7, 7), destination = (4, 2), deadline = 20
Simulating trial. . .
epsilon = 0.0189; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (7, 2), heading: (0, 1), action: right, reward: 1.54340254738
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.5434025473801067, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.54)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 2), heading: (-1, 0), action: right, reward: 1.67291672553
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 1.6729167255267916, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 1.67)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 7), heading: (0, -1), action: right, reward: 0.345157825486
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', 'forward'), 'deadline': 18, 't': 2, 'action': 'right', 'reward': 0.34515782548606155, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', 'forward')
Agent drove right instead of forward. (rewarded 0.35)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 7), heading: (-1, 0), action: left, reward: 1.46139659429
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 17, 't': 3, 'action': 'left', 'reward': 1.4613965942889764, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.46)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (4, 7), heading: (-1, 0), action: forward, reward: 1.64603188478
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 1.6460318847815025, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.65)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (4, 7), heading: (-1, 0), action: None, reward: 2.13551082731
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 15, 't': 5, 'action': None, 'reward': 2.135510827308788, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.14)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 7), heading: (-1, 0), action: None, reward: 2.84119544312
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 14, 't': 6, 'action': None, 'reward': 2.8411954431225492, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.84)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 2), heading: (0, 1), action: left, reward: 2.02907014732
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 13, 't': 7, 'action': 'left', 'reward': 2.029070147324501, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.03)
60% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 399
\-------------------------
Environment.reset(): Trial set up with start = (8, 2), destination = (4, 4), deadline = 30
Simulating trial. . .
epsilon = 0.0187; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: None, reward: 2.19765166445
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 30, 't': 0, 'action': None, 'reward': 2.1976516644512842, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.20)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: None, reward: 2.98361533166
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 29, 't': 1, 'action': None, 'reward': 2.9836153316561624, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.98)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: None, reward: 2.60679184499
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 28, 't': 2, 'action': None, 'reward': 2.6067918449904646, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.61)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 2), heading: (-1, 0), action: None, reward: 1.93446279878
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 27, 't': 3, 'action': None, 'reward': 1.9344627987810625, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.93)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 3), heading: (0, 1), action: left, reward: 2.55389174446
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 26, 't': 4, 'action': 'left', 'reward': 2.553891744462179, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.55)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 3), heading: (-1, 0), action: right, reward: 1.3928766839
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', None), 'deadline': 25, 't': 5, 'action': 'right', 'reward': 1.3928766839017435, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', None)
Agent drove right instead of left. (rewarded 1.39)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (7, 3), heading: (-1, 0), action: None, reward: 2.75633377021
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 24, 't': 6, 'action': None, 'reward': 2.756333770208694, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.76)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 3), heading: (-1, 0), action: forward, reward: 2.81176008331
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 23, 't': 7, 'action': 'forward', 'reward': 2.8117600833089345, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.81)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (6, 3), heading: (-1, 0), action: None, reward: 2.32493390734
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 22, 't': 8, 'action': None, 'reward': 2.3249339073411557, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.32)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (5, 3), heading: (-1, 0), action: forward, reward: 0.95593482658
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 21, 't': 9, 'action': 'forward', 'reward': 0.9559348265795848, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent followed the waypoint forward. (rewarded 0.96)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (4, 3), heading: (-1, 0), action: forward, reward: 2.49255992894
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 20, 't': 10, 'action': 'forward', 'reward': 2.492559928943745, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.49)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 4), heading: (0, 1), action: left, reward: 1.19979090237
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 19, 't': 11, 'action': 'left', 'reward': 1.1997909023663473, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 1.20)
60% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 400
\-------------------------
Environment.reset(): Trial set up with start = (1, 2), destination = (3, 4), deadline = 20
Simulating trial. . .
epsilon = 0.0185; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 2), heading: (0, 1), action: None, reward: 1.46519809814
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 20, 't': 0, 'action': None, 'reward': 1.465198098144024, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.47)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 2), heading: (0, 1), action: None, reward: 2.34476969598
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 19, 't': 1, 'action': None, 'reward': 2.3447696959815274, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.34)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 2), heading: (0, 1), action: None, reward: 1.39050416769
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.3905041676930867, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.39)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 2), heading: (1, 0), action: left, reward: 2.54150163064
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 17, 't': 3, 'action': 'left', 'reward': 2.541501630642226, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.54)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 2), heading: (1, 0), action: None, reward: 0.979793128659
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 16, 't': 4, 'action': None, 'reward': 0.9797931286592199, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.98)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (3, 2), heading: (1, 0), action: forward, reward: 1.79805659472
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 1.7980565947235978, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.80)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 3), heading: (0, 1), action: right, reward: 1.40883963271
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 14, 't': 6, 'action': 'right', 'reward': 1.4088396327115296, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.41)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 3), heading: (0, 1), action: None, reward: 2.54073503884
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 13, 't': 7, 'action': None, 'reward': 2.540735038843181, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.54)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 3), heading: (0, 1), action: None, reward: 1.08194954062
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 12, 't': 8, 'action': None, 'reward': 1.0819495406228725, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.08)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 4), heading: (0, 1), action: forward, reward: 1.0455159654
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 11, 't': 9, 'action': 'forward', 'reward': 1.0455159653954544, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.05)
50% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 401
\-------------------------
Environment.reset(): Trial set up with start = (2, 3), destination = (6, 6), deadline = 35
Simulating trial. . .
epsilon = 0.0183; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 3), heading: (0, -1), action: None, reward: 1.34791380042
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', 'forward'), 'deadline': 35, 't': 0, 'action': None, 'reward': 1.347913800421242, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', 'forward')
Agent properly idled at a red light. (rewarded 1.35)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 3), heading: (0, -1), action: None, reward: 2.4347404466
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 34, 't': 1, 'action': None, 'reward': 2.4347404465999616, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.43)
94% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 3), heading: (0, -1), action: None, reward: 1.60965734341
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 33, 't': 2, 'action': None, 'reward': 1.6096573434075545, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.61)
91% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 3), heading: (0, -1), action: None, reward: 2.22513165111
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 32, 't': 3, 'action': None, 'reward': 2.225131651109468, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.23)
89% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 3), heading: (0, -1), action: None, reward: 2.25753142651
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 31, 't': 4, 'action': None, 'reward': 2.2575314265059165, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.26)
86% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 3), heading: (0, -1), action: None, reward: 1.91807888798
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 30, 't': 5, 'action': None, 'reward': 1.918078887975363, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.92)
83% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 3), heading: (-1, 0), action: left, reward: 1.71886453151
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 29, 't': 6, 'action': 'left', 'reward': 1.7188645315081525, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.72)
80% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (8, 3), heading: (-1, 0), action: forward, reward: 1.79059884987
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 28, 't': 7, 'action': 'forward', 'reward': 1.7905988498717302, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent followed the waypoint forward. (rewarded 1.79)
77% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (8, 3), heading: (-1, 0), action: None, reward: 2.64538358643
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 27, 't': 8, 'action': None, 'reward': 2.6453835864347814, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.65)
74% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (7, 3), heading: (-1, 0), action: forward, reward: 1.83628477527
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 26, 't': 9, 'action': 'forward', 'reward': 1.8362847752738944, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.84)
71% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (6, 3), heading: (-1, 0), action: forward, reward: 1.99001850042
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 25, 't': 10, 'action': 'forward', 'reward': 1.9900185004226107, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'right')
Agent followed the waypoint forward. (rewarded 1.99)
69% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (6, 2), heading: (0, -1), action: right, reward: 1.24274349764
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 24, 't': 11, 'action': 'right', 'reward': 1.2427434976359315, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.24)
66% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (6, 7), heading: (0, -1), action: forward, reward: 1.80155672946
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 23, 't': 12, 'action': 'forward', 'reward': 1.8015567294558619, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.80)
63% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (6, 7), heading: (0, -1), action: None, reward: 1.33438399716
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 22, 't': 13, 'action': None, 'reward': 1.3343839971630476, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.33)
60% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (6, 7), heading: (0, -1), action: None, reward: 2.72306886698
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 21, 't': 14, 'action': None, 'reward': 2.723068866979562, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.72)
57% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 6), heading: (0, -1), action: forward, reward: 2.29498917796
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 20, 't': 15, 'action': 'forward', 'reward': 2.2949891779634237, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'right')
Agent followed the waypoint forward. (rewarded 2.29)
54% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 402
\-------------------------
Environment.reset(): Trial set up with start = (7, 3), destination = (4, 5), deadline = 25
Simulating trial. . .
epsilon = 0.0181; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: right, reward: 1.5289948426
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', 'right'), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 1.528994842601864, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', 'right')
Agent drove right instead of left. (rewarded 1.53)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: None, reward: 2.73223764109
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 24, 't': 1, 'action': None, 'reward': 2.732237641087019, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.73)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: None, reward: 1.13989804761
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.1398980476077407, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.14)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: None, reward: 2.24656962765
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 22, 't': 3, 'action': None, 'reward': 2.2465696276450586, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.25)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: None, reward: 2.89703955851
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 21, 't': 4, 'action': None, 'reward': 2.897039558510396, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.90)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: forward, reward: 1.38832650252
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 20, 't': 5, 'action': 'forward', 'reward': 1.3883265025230291, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.39)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: forward, reward: 2.5642107592
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 19, 't': 6, 'action': 'forward', 'reward': 2.5642107591952987, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.56)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: forward, reward: 2.10515653894
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 2.1051565389363507, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent followed the waypoint forward. (rewarded 2.11)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: None, reward: 2.66916975626
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 17, 't': 8, 'action': None, 'reward': 2.669169756263401, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.67)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (4, 3), heading: (1, 0), action: forward, reward: 2.01564009489
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 16, 't': 9, 'action': 'forward', 'reward': 2.0156400948857165, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.02)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (4, 4), heading: (0, 1), action: right, reward: 2.10330657976
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 15, 't': 10, 'action': 'right', 'reward': 2.103306579759587, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.10)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (4, 4), heading: (0, 1), action: None, reward: 2.11443740454
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 14, 't': 11, 'action': None, 'reward': 2.1144374045395833, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.11)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (4, 4), heading: (0, 1), action: None, reward: 2.36357540611
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 13, 't': 12, 'action': None, 'reward': 2.3635754061126057, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.36)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (4, 4), heading: (0, 1), action: None, reward: 2.34482599989
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 12, 't': 13, 'action': None, 'reward': 2.34482599989282, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.34)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (4, 4), heading: (0, 1), action: None, reward: 1.61342514448
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 11, 't': 14, 'action': None, 'reward': 1.6134251444776937, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.61)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (4, 4), heading: (0, 1), action: None, reward: 2.16484329363
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'left'), 'deadline': 10, 't': 15, 'action': None, 'reward': 2.164843293625882, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'left')
Agent properly idled at a red light. (rewarded 2.16)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 5), heading: (0, 1), action: forward, reward: 1.97512551218
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 9, 't': 16, 'action': 'forward', 'reward': 1.9751255121771931, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.98)
32% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 403
\-------------------------
Environment.reset(): Trial set up with start = (5, 3), destination = (7, 6), deadline = 25
Simulating trial. . .
epsilon = 0.0180; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: None, reward: 1.93260797421
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', 'left'), 'deadline': 25, 't': 0, 'action': None, 'reward': 1.9326079742079272, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', 'left')
Agent properly idled at a red light. (rewarded 1.93)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: None, reward: 2.53867186039
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 24, 't': 1, 'action': None, 'reward': 2.5386718603877787, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.54)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: None, reward: 2.57113210486
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 23, 't': 2, 'action': None, 'reward': 2.5711321048611175, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.57)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: None, reward: 1.61222989544
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 22, 't': 3, 'action': None, 'reward': 1.6122298954361494, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.61)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: None, reward: 2.2171633474
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', 'right'), 'deadline': 21, 't': 4, 'action': None, 'reward': 2.2171633474029546, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'right')
Agent properly idled at a red light. (rewarded 2.22)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: None, reward: 2.32655031331
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 20, 't': 5, 'action': None, 'reward': 2.326550313314641, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.33)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (6, 3), heading: (1, 0), action: left, reward: 2.8903121044
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 19, 't': 6, 'action': 'left', 'reward': 2.890312104403143, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.89)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 3), heading: (1, 0), action: None, reward: 1.87881262546
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 18, 't': 7, 'action': None, 'reward': 1.8788126254620692, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.88)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (6, 3), heading: (1, 0), action: None, reward: 1.46770189833
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 17, 't': 8, 'action': None, 'reward': 1.4677018983290762, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.47)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (7, 3), heading: (1, 0), action: forward, reward: 1.49582833668
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 16, 't': 9, 'action': 'forward', 'reward': 1.4958283366817962, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent followed the waypoint forward. (rewarded 1.50)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (7, 3), heading: (1, 0), action: None, reward: 1.42766459997
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 15, 't': 10, 'action': None, 'reward': 1.4276645999684943, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 1.43)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (7, 2), heading: (0, -1), action: left, reward: 1.48171005676
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 14, 't': 11, 'action': 'left', 'reward': 1.4817100567590102, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.48)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (7, 2), heading: (0, -1), action: None, reward: 2.10110559096
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 13, 't': 12, 'action': None, 'reward': 2.1011055909551715, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 2.10)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (7, 7), heading: (0, -1), action: forward, reward: 2.64051483196
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 12, 't': 13, 'action': 'forward', 'reward': 2.640514831963145, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.64)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (7, 6), heading: (0, -1), action: forward, reward: 1.99033910087
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 11, 't': 14, 'action': 'forward', 'reward': 1.990339100868814, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.99)
40% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 404
\-------------------------
Environment.reset(): Trial set up with start = (2, 5), destination = (5, 3), deadline = 25
Simulating trial. . .
epsilon = 0.0178; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: left, reward: 2.05599744317
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'forward'), 'deadline': 25, 't': 0, 'action': 'left', 'reward': 2.0559974431738812, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'forward')
Agent followed the waypoint left. (rewarded 2.06)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: forward, reward: 1.33928791785
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 24, 't': 1, 'action': 'forward', 'reward': 1.3392879178523207, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent followed the waypoint forward. (rewarded 1.34)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: None, reward: 1.02208219995
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.0220821999487033, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.02)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: None, reward: 2.23570887462
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 22, 't': 3, 'action': None, 'reward': 2.235708874622919, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.24)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (4, 5), heading: (1, 0), action: None, reward: 0.963885167874
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 21, 't': 4, 'action': None, 'reward': 0.9638851678744103, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.96)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 5), heading: (1, 0), action: forward, reward: 1.9029996297
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 20, 't': 5, 'action': 'forward', 'reward': 1.9029996297041338, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.90)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (5, 4), heading: (0, -1), action: left, reward: 2.90297940316
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 19, 't': 6, 'action': 'left', 'reward': 2.9029794031603524, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.90)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (5, 4), heading: (0, -1), action: None, reward: 2.60427071648
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 18, 't': 7, 'action': None, 'reward': 2.6042707164817562, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.60)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (5, 4), heading: (0, -1), action: None, reward: 1.4787108285
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 17, 't': 8, 'action': None, 'reward': 1.4787108285048987, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.48)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (5, 4), heading: (0, -1), action: None, reward: 0.920579014908
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 16, 't': 9, 'action': None, 'reward': 0.9205790149075321, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 0.92)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (5, 4), heading: (0, -1), action: None, reward: 2.09403402144
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'forward'), 'deadline': 15, 't': 10, 'action': None, 'reward': 2.094034021435882, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 2.09)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 3), heading: (0, -1), action: forward, reward: 2.56184670117
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 14, 't': 11, 'action': 'forward', 'reward': 2.5618467011739394, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.56)
52% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 405
\-------------------------
Environment.reset(): Trial set up with start = (5, 6), destination = (1, 2), deadline = 30
Simulating trial. . .
epsilon = 0.0176; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 6), heading: (1, 0), action: forward, reward: 1.84393081247
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'right'), 'deadline': 30, 't': 0, 'action': 'forward', 'reward': 1.843930812468783, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'right')
Agent followed the waypoint forward. (rewarded 1.84)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (7, 6), heading: (1, 0), action: forward, reward: 1.72105866958
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', 'right'), 'deadline': 29, 't': 1, 'action': 'forward', 'reward': 1.7210586695797132, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', 'right')
Agent followed the waypoint forward. (rewarded 1.72)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: forward, reward: 1.04279422703
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 28, 't': 2, 'action': 'forward', 'reward': 1.0427942270289157, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.04)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 6), heading: (1, 0), action: None, reward: 2.76568225474
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 27, 't': 3, 'action': None, 'reward': 2.765682254744761, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.77)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: forward, reward: 2.02470828172
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 26, 't': 4, 'action': 'forward', 'reward': 2.0247082817232043, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.02)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: right, reward: 1.79459038518
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 25, 't': 5, 'action': 'right', 'reward': 1.7945903851797214, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.79)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: None, reward: 2.59339158123
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 24, 't': 6, 'action': None, 'reward': 2.593391581227367, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.59)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 2), heading: (0, 1), action: forward, reward: 1.75629355241
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 23, 't': 7, 'action': 'forward', 'reward': 1.7562935524123642, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.76)
73% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 406
\-------------------------
Environment.reset(): Trial set up with start = (2, 4), destination = (7, 5), deadline = 20
Simulating trial. . .
epsilon = 0.0174; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: right, reward: 1.12966788532
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.1296678853211355, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 1.13)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 3), heading: (0, -1), action: right, reward: 1.037593785
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', 'left'), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 1.03759378499793, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', 'left')
Agent drove right instead of forward. (rewarded 1.04)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 3), heading: (-1, 0), action: left, reward: 1.58460097034
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 18, 't': 2, 'action': 'left', 'reward': 1.5846009703354023, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.58)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 3), heading: (-1, 0), action: None, reward: 1.15890021119
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 17, 't': 3, 'action': None, 'reward': 1.1589002111922047, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.16)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 3), heading: (-1, 0), action: None, reward: 2.05883116924
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 16, 't': 4, 'action': None, 'reward': 2.05883116923546, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.06)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 3), heading: (-1, 0), action: forward, reward: 1.73869050979
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 1.7386905097902503, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.74)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (7, 4), heading: (0, 1), action: left, reward: 1.89221440885
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 14, 't': 6, 'action': 'left', 'reward': 1.8922144088512662, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.89)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (7, 5), heading: (0, 1), action: forward, reward: 2.00785609506
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 2.007856095060716, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent followed the waypoint forward. (rewarded 2.01)
60% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 407
\-------------------------
Environment.reset(): Trial set up with start = (7, 2), destination = (3, 2), deadline = 20
Simulating trial. . .
epsilon = 0.0172; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 2), heading: (1, 0), action: left, reward: 2.73440599026
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'right'), 'deadline': 20, 't': 0, 'action': 'left', 'reward': 2.7344059902562377, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'right')
Agent followed the waypoint left. (rewarded 2.73)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: forward, reward: 2.74758862222
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 19, 't': 1, 'action': 'forward', 'reward': 2.7475886222249075, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 2.75)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 2), heading: (1, 0), action: None, reward: 2.7586752186
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.7586752186005543, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.76)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 2), heading: (1, 0), action: forward, reward: 1.87049296675
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 1.870492966750743, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent followed the waypoint forward. (rewarded 1.87)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 2), heading: (1, 0), action: forward, reward: 2.79739836015
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 2.7973983601529597, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.80)
75% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 408
\-------------------------
Environment.reset(): Trial set up with start = (6, 3), destination = (2, 5), deadline = 30
Simulating trial. . .
epsilon = 0.0171; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (7, 3), heading: (1, 0), action: forward, reward: 2.49504886748
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 30, 't': 0, 'action': 'forward', 'reward': 2.4950488674802656, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.50)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (7, 3), heading: (1, 0), action: None, reward: 2.84516489905
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 29, 't': 1, 'action': None, 'reward': 2.84516489905247, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.85)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 3), heading: (1, 0), action: None, reward: 1.85663964966
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 28, 't': 2, 'action': None, 'reward': 1.8566396496632973, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.86)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: forward, reward: 2.54808484832
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 27, 't': 3, 'action': 'forward', 'reward': 2.548084848316906, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.55)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 3), heading: (1, 0), action: None, reward: 2.48393145365
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 26, 't': 4, 'action': None, 'reward': 2.483931453650986, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.48)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: forward, reward: 2.08513705336
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 25, 't': 5, 'action': 'forward', 'reward': 2.0851370533583924, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.09)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: forward, reward: 2.55146117853
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 24, 't': 6, 'action': 'forward', 'reward': 2.5514611785271892, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.55)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 4), heading: (0, 1), action: right, reward: 1.20354867834
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 23, 't': 7, 'action': 'right', 'reward': 1.2035486783353608, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.20)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (2, 5), heading: (0, 1), action: forward, reward: 1.19341192552
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 22, 't': 8, 'action': 'forward', 'reward': 1.1934119255249953, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.19)
70% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 409
\-------------------------
Environment.reset(): Trial set up with start = (6, 6), destination = (3, 4), deadline = 25
Simulating trial. . .
epsilon = 0.0169; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (5, 6), heading: (-1, 0), action: right, reward: 1.41871647301
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', 'right'), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 1.4187164730134867, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', 'right')
Agent followed the waypoint right. (rewarded 1.42)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (5, 6), heading: (-1, 0), action: None, reward: 2.45542993979
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 24, 't': 1, 'action': None, 'reward': 2.4554299397940618, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.46)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (5, 6), heading: (-1, 0), action: None, reward: 1.30057598676
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.3005759867560003, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.30)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (5, 6), heading: (-1, 0), action: None, reward: 1.06039837612
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 22, 't': 3, 'action': None, 'reward': 1.06039837611915, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.06)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 6), heading: (-1, 0), action: None, reward: 2.95039167287
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 21, 't': 4, 'action': None, 'reward': 2.9503916728716835, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.95)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 6), heading: (-1, 0), action: None, reward: 1.64091334624
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 20, 't': 5, 'action': None, 'reward': 1.6409133462385006, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.64)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 6), heading: (-1, 0), action: forward, reward: 2.2968070656
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 19, 't': 6, 'action': 'forward', 'reward': 2.296807065601196, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.30)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (4, 6), heading: (-1, 0), action: None, reward: 2.4512520393
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 18, 't': 7, 'action': None, 'reward': 2.451252039296118, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.45)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (4, 6), heading: (-1, 0), action: None, reward: 1.63561522125
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 17, 't': 8, 'action': None, 'reward': 1.6356152212515724, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.64)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (4, 6), heading: (-1, 0), action: None, reward: 1.93967957488
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 16, 't': 9, 'action': None, 'reward': 1.9396795748782036, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.94)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (3, 6), heading: (-1, 0), action: forward, reward: 0.977156781625
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 15, 't': 10, 'action': 'forward', 'reward': 0.9771567816252971, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 0.98)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (3, 5), heading: (0, -1), action: right, reward: 2.00499273393
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 14, 't': 11, 'action': 'right', 'reward': 2.0049927339322258, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.00)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 4), heading: (0, -1), action: forward, reward: 2.33406086793
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 13, 't': 12, 'action': 'forward', 'reward': 2.334060867929022, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.33)
48% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 410
\-------------------------
Environment.reset(): Trial set up with start = (2, 5), destination = (4, 7), deadline = 20
Simulating trial. . .
epsilon = 0.0167; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (3, 5), heading: (1, 0), action: right, reward: 1.49737456566
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.4973745656645796, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent followed the waypoint right. (rewarded 1.50)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 6), heading: (0, 1), action: right, reward: 1.67726655325
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 1.6772665532521231, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent drove right instead of forward. (rewarded 1.68)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 6), heading: (0, 1), action: None, reward: 2.1531973443
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.15319734429872, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.15)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (4, 6), heading: (1, 0), action: left, reward: 2.3560066357
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 17, 't': 3, 'action': 'left', 'reward': 2.3560066356985097, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 2.36)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 7), heading: (0, 1), action: right, reward: 2.02269038946
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 16, 't': 4, 'action': 'right', 'reward': 2.022690389457037, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent followed the waypoint right. (rewarded 2.02)
75% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 411
\-------------------------
Environment.reset(): Trial set up with start = (8, 6), destination = (4, 3), deadline = 35
Simulating trial. . .
epsilon = 0.0166; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 6), heading: (1, 0), action: right, reward: 2.16926918971
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 35, 't': 0, 'action': 'right', 'reward': 2.169269189713867, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.17)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: forward, reward: 2.25487440836
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 34, 't': 1, 'action': 'forward', 'reward': 2.254874408357707, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.25)
94% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: forward, reward: 1.13923872012
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 33, 't': 2, 'action': 'forward', 'reward': 1.139238720120569, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.14)
91% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: None, reward: 2.88395562297
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 32, 't': 3, 'action': None, 'reward': 2.88395562296766, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 2.88)
89% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: None, reward: 2.82003863028
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 31, 't': 4, 'action': None, 'reward': 2.820038630276195, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.82)
86% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (4, 6), heading: (1, 0), action: forward, reward: 2.08549064583
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 30, 't': 5, 'action': 'forward', 'reward': 2.085490645826577, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.09)
83% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 7), heading: (0, 1), action: right, reward: 1.15036376334
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 29, 't': 6, 'action': 'right', 'reward': 1.1503637633358212, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.15)
80% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (4, 2), heading: (0, 1), action: forward, reward: 1.91056626234
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 28, 't': 7, 'action': 'forward', 'reward': 1.9105662623448285, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.91)
77% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 3), heading: (0, 1), action: forward, reward: 1.07576934865
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 27, 't': 8, 'action': 'forward', 'reward': 1.0757693486539062, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent followed the waypoint forward. (rewarded 1.08)
74% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 412
\-------------------------
Environment.reset(): Trial set up with start = (1, 5), destination = (5, 7), deadline = 30
Simulating trial. . .
epsilon = 0.0164; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: None, reward: 1.08570872624
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'right', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'right'), 'deadline': 30, 't': 0, 'action': None, 'reward': 1.0857087262410856, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'right')
Agent properly idled at a red light. (rewarded 1.09)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: None, reward: 1.88203690225
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 29, 't': 1, 'action': None, 'reward': 1.882036902251011, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.88)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: None, reward: 2.82499920952
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'right'), 'deadline': 28, 't': 2, 'action': None, 'reward': 2.8249992095152363, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'right')
Agent properly idled at a red light. (rewarded 2.82)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: forward, reward: 1.35963619949
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 27, 't': 3, 'action': 'forward', 'reward': 1.3596361994876025, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.36)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 5), heading: (-1, 0), action: forward, reward: 1.0370909737
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 26, 't': 4, 'action': 'forward', 'reward': 1.0370909736986311, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.04)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (6, 5), heading: (-1, 0), action: forward, reward: 2.85627473839
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'left'), 'deadline': 25, 't': 5, 'action': 'forward', 'reward': 2.856274738393047, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'left')
Agent followed the waypoint forward. (rewarded 2.86)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (5, 5), heading: (-1, 0), action: forward, reward: 1.9931429224
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 24, 't': 6, 'action': 'forward', 'reward': 1.9931429223959152, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.99)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (5, 6), heading: (0, 1), action: left, reward: 1.67277740556
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 23, 't': 7, 'action': 'left', 'reward': 1.6727774055623612, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.67)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 7), heading: (0, 1), action: forward, reward: 2.18837658416
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 22, 't': 8, 'action': 'forward', 'reward': 2.1883765841575293, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.19)
70% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 413
\-------------------------
Environment.reset(): Trial set up with start = (7, 4), destination = (3, 3), deadline = 25
Simulating trial. . .
epsilon = 0.0162; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 4), heading: (1, 0), action: right, reward: 1.2276825939
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 1.2276825939000175, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent followed the waypoint right. (rewarded 1.23)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 4), heading: (1, 0), action: forward, reward: 2.64043006955
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 24, 't': 1, 'action': 'forward', 'reward': 2.6404300695453573, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 2.64)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 4), heading: (1, 0), action: None, reward: 1.16804032045
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.168040320445742, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.17)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 4), heading: (1, 0), action: forward, reward: 2.62684501925
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': 2.626845019251869, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.63)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 4), heading: (1, 0), action: None, reward: 2.81116073702
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 21, 't': 4, 'action': None, 'reward': 2.8111607370207063, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.81)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 4), heading: (1, 0), action: None, reward: 2.25049019316
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 20, 't': 5, 'action': None, 'reward': 2.250490193164371, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.25)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 4), heading: (1, 0), action: None, reward: 2.77649564103
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 19, 't': 6, 'action': None, 'reward': 2.776495641034975, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.78)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: forward, reward: 2.89811530448
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 2.8981153044800987, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.90)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: None, reward: 2.13345258213
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 17, 't': 8, 'action': None, 'reward': 2.1334525821278327, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.13)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 3), heading: (0, -1), action: left, reward: 1.94031758362
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 16, 't': 9, 'action': 'left', 'reward': 1.9403175836168052, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.94)
60% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 414
\-------------------------
Environment.reset(): Trial set up with start = (2, 4), destination = (6, 4), deadline = 20
Simulating trial. . .
epsilon = 0.0161; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: left, reward: 1.99842077411
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 20, 't': 0, 'action': 'left', 'reward': 1.9984207741121056, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.00)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: None, reward: 1.01483854898
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 19, 't': 1, 'action': None, 'reward': 1.0148385489765004, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.01)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: None, reward: 2.71549459914
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.715494599136873, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.72)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 4), heading: (-1, 0), action: forward, reward: 2.91095144822
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 2.910951448215527, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.91)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 4), heading: (-1, 0), action: forward, reward: 1.93564268344
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 1.935642683435943, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.94)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 4), heading: (-1, 0), action: None, reward: 2.75045825453
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 15, 't': 5, 'action': None, 'reward': 2.75045825452552, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.75)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (7, 4), heading: (-1, 0), action: None, reward: 0.915896145395
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 14, 't': 6, 'action': None, 'reward': 0.9158961453951042, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.92)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 4), heading: (-1, 0), action: forward, reward: 2.78155518414
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 2.7815551841388437, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.78)
60% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 415
\-------------------------
Environment.reset(): Trial set up with start = (5, 5), destination = (1, 6), deadline = 25
Simulating trial. . .
epsilon = 0.0159; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 5), heading: (1, 0), action: right, reward: 2.099797057
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 2.0997970569986357, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.10)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 5), heading: (1, 0), action: None, reward: 2.26215943493
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 24, 't': 1, 'action': None, 'reward': 2.2621594349313643, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.26)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 5), heading: (1, 0), action: None, reward: 2.36445452283
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 23, 't': 2, 'action': None, 'reward': 2.3644545228312435, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.36)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 5), heading: (1, 0), action: forward, reward: 1.81409268912
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': 1.814092689123682, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent followed the waypoint forward. (rewarded 1.81)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 5), heading: (1, 0), action: None, reward: 2.13778926355
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 21, 't': 4, 'action': None, 'reward': 2.1377892635459856, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.14)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 5), heading: (1, 0), action: None, reward: 1.58406498715
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'forward'), 'deadline': 20, 't': 5, 'action': None, 'reward': 1.584064987154071, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 1.58)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (8, 5), heading: (1, 0), action: forward, reward: 2.58179614595
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 19, 't': 6, 'action': 'forward', 'reward': 2.581796145952873, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.58)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 5), heading: (1, 0), action: forward, reward: 1.22572564957
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 1.2257256495704663, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.23)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 6), heading: (0, 1), action: right, reward: 2.13863748645
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 17, 't': 8, 'action': 'right', 'reward': 2.1386374864474718, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.14)
64% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 416
\-------------------------
Environment.reset(): Trial set up with start = (2, 6), destination = (5, 3), deadline = 30
Simulating trial. . .
epsilon = 0.0158; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: None, reward: 1.6287013654
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'left'), 'deadline': 30, 't': 0, 'action': None, 'reward': 1.6287013654024833, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 1.63)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: None, reward: 1.50974127878
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 29, 't': 1, 'action': None, 'reward': 1.5097412787761313, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.51)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: None, reward: 2.71915054142
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 28, 't': 2, 'action': None, 'reward': 2.719150541420932, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.72)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: None, reward: 1.41634663881
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 27, 't': 3, 'action': None, 'reward': 1.4163466388137063, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.42)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 6), heading: (-1, 0), action: None, reward: 1.2159669887
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 26, 't': 4, 'action': None, 'reward': 1.2159669887022726, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.22)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 7), heading: (0, 1), action: left, reward: 2.04566349527
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 25, 't': 5, 'action': 'left', 'reward': 2.045663495274453, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.05)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 7), heading: (0, 1), action: None, reward: 2.90717183836
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 24, 't': 6, 'action': None, 'reward': 2.9071718383599867, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.91)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 7), heading: (1, 0), action: left, reward: 2.4599616801
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 23, 't': 7, 'action': 'left', 'reward': 2.45996168009551, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.46)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (4, 7), heading: (1, 0), action: forward, reward: 1.12533948462
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 22, 't': 8, 'action': 'forward', 'reward': 1.1253394846175928, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.13)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (5, 7), heading: (1, 0), action: forward, reward: 0.951604883955
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 21, 't': 9, 'action': 'forward', 'reward': 0.9516048839549878, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 0.95)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (5, 2), heading: (0, 1), action: right, reward: 2.8114288656
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'left'), 'deadline': 20, 't': 10, 'action': 'right', 'reward': 2.811428865603026, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'left')
Agent followed the waypoint right. (rewarded 2.81)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (5, 2), heading: (0, 1), action: None, reward: 2.46546385958
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 19, 't': 11, 'action': None, 'reward': 2.4654638595846894, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.47)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (5, 2), heading: (0, 1), action: None, reward: 2.56440978755
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 18, 't': 12, 'action': None, 'reward': 2.564409787549243, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.56)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: forward, reward: 2.53613529783
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 17, 't': 13, 'action': 'forward', 'reward': 2.5361352978342144, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.54)
53% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 417
\-------------------------
Environment.reset(): Trial set up with start = (1, 6), destination = (3, 2), deadline = 20
Simulating trial. . .
epsilon = 0.0156; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: right, reward: 1.82452098833
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.8245209883263638, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.82)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: forward, reward: 1.7438572513
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'right', None), 'deadline': 19, 't': 1, 'action': 'forward', 'reward': 1.743857251304525, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', None)
Agent followed the waypoint forward. (rewarded 1.74)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 7), heading: (0, 1), action: right, reward: 2.68315503688
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 18, 't': 2, 'action': 'right', 'reward': 2.683155036881341, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 2.68)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 7), heading: (-1, 0), action: right, reward: 0.0132929257087
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', 'left'), 'deadline': 17, 't': 3, 'action': 'right', 'reward': 0.013292925708724757, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', 'left')
Agent drove right instead of forward. (rewarded 0.01)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 2), heading: (0, 1), action: left, reward: 2.68361049473
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 16, 't': 4, 'action': 'left', 'reward': 2.683610494726796, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.68)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 2), heading: (0, 1), action: None, reward: 1.65205794018
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'right'), 'deadline': 15, 't': 5, 'action': None, 'reward': 1.6520579401770494, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 1.65)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 2), heading: (0, 1), action: None, reward: 2.29970007497
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'forward'), 'deadline': 14, 't': 6, 'action': None, 'reward': 2.299700074966175, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.30)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 3), heading: (0, 1), action: forward, reward: 0.0962488609781
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', None), 'deadline': 13, 't': 7, 'action': 'forward', 'reward': 0.09624886097806806, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', None)
Agent drove forward instead of left. (rewarded 0.10)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 4), heading: (0, 1), action: forward, reward: 0.733316184575
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', None), 'deadline': 12, 't': 8, 'action': 'forward', 'reward': 0.7333161845747712, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', None)
Agent drove forward instead of left. (rewarded 0.73)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (3, 4), heading: (1, 0), action: left, reward: 1.97493887248
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 11, 't': 9, 'action': 'left', 'reward': 1.9749388724832528, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 1.97)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (3, 3), heading: (0, -1), action: left, reward: 2.10654376075
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 10, 't': 10, 'action': 'left', 'reward': 2.1065437607499593, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.11)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (3, 3), heading: (0, -1), action: None, reward: 1.50935940414
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 9, 't': 11, 'action': None, 'reward': 1.5093594041424818, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.51)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (3, 3), heading: (0, -1), action: None, reward: 2.45383355022
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 8, 't': 12, 'action': None, 'reward': 2.4538335502173583, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.45)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 2), heading: (0, -1), action: forward, reward: 2.32982275133
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 7, 't': 13, 'action': 'forward', 'reward': 2.3298227513310055, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.33)
30% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 418
\-------------------------
Environment.reset(): Trial set up with start = (1, 2), destination = (8, 5), deadline = 20
Simulating trial. . .
epsilon = 0.0155; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 7), heading: (0, -1), action: left, reward: 2.5970361458
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'left'), 'deadline': 20, 't': 0, 'action': 'left', 'reward': 2.5970361458006, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'left')
Agent followed the waypoint left. (rewarded 2.60)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 7), heading: (1, 0), action: right, reward: 1.52472331125
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', 'forward'), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 1.5247233112485743, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', 'forward')
Agent drove right instead of left. (rewarded 1.52)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 7), heading: (1, 0), action: None, reward: 2.68792423125
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'forward'), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.687924231248635, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 2.69)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 7), heading: (1, 0), action: None, reward: 2.40682629529
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 2.4068262952919492, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.41)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 6), heading: (0, -1), action: left, reward: 0.982712645963
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 16, 't': 4, 'action': 'left', 'reward': 0.9827126459632538, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 0.98)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 6), heading: (0, -1), action: None, reward: 1.22407835302
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'forward'), 'deadline': 15, 't': 5, 'action': None, 'reward': 1.2240783530151333, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 1.22)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (2, 6), heading: (0, -1), action: None, reward: 2.34617255795
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'left'), 'deadline': 14, 't': 6, 'action': None, 'reward': 2.346172557953923, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 2.35)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 6), heading: (0, -1), action: None, reward: 1.67308304271
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 13, 't': 7, 'action': None, 'reward': 1.6730830427148315, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.67)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 6), heading: (0, -1), action: None, reward: 1.0147061578
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'forward'), 'deadline': 12, 't': 8, 'action': None, 'reward': 1.0147061578021765, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 1.01)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (1, 6), heading: (-1, 0), action: left, reward: 0.94155770308
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 11, 't': 9, 'action': 'left', 'reward': 0.9415577030803717, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 0.94)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (8, 6), heading: (-1, 0), action: forward, reward: 1.78733924736
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 10, 't': 10, 'action': 'forward', 'reward': 1.787339247362433, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent followed the waypoint forward. (rewarded 1.79)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (8, 6), heading: (-1, 0), action: None, reward: 0.446733285922
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'forward'), 'deadline': 9, 't': 11, 'action': None, 'reward': 0.4467332859223734, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 0.45)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (8, 5), heading: (0, -1), action: right, reward: 2.47152746416
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', 'left'), 'deadline': 8, 't': 12, 'action': 'right', 'reward': 2.471527464157762, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', 'left')
Agent followed the waypoint right. (rewarded 2.47)
35% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 419
\-------------------------
Environment.reset(): Trial set up with start = (8, 6), destination = (5, 7), deadline = 20
Simulating trial. . .
epsilon = 0.0153; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 7), heading: (0, 1), action: right, reward: 1.66653055048
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'right', 'left'), 'deadline': 20, 't': 0, 'action': 'right', 'reward': 1.6665305504844554, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'right', 'left')
Agent followed the waypoint right. (rewarded 1.67)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: right, reward: 2.32240930995
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', 'left'), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 2.322409309946716, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', 'left')
Agent followed the waypoint right. (rewarded 2.32)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: None, reward: 2.80052286579
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 2.8005228657910224, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.80)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: None, reward: 1.08058948013
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 1.0805894801327025, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.08)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: forward, reward: 1.17598946795
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 1.1759894679475416, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.18)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 7), heading: (-1, 0), action: forward, reward: 2.91291744351
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 2.91291744350703, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.91)
70% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 420
\-------------------------
Environment.reset(): Trial set up with start = (2, 5), destination = (6, 3), deadline = 30
Simulating trial. . .
epsilon = 0.0151; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 5), heading: (-1, 0), action: right, reward: 1.0422254349
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 30, 't': 0, 'action': 'right', 'reward': 1.0422254349003006, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 1.04)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: forward, reward: 1.70290313259
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 29, 't': 1, 'action': 'forward', 'reward': 1.7029031325877577, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 1.70)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: None, reward: 2.56142760023
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 28, 't': 2, 'action': None, 'reward': 2.561427600233519, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.56)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: None, reward: 1.70616331935
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 27, 't': 3, 'action': None, 'reward': 1.706163319350446, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.71)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 5), heading: (-1, 0), action: None, reward: 1.16617361822
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 26, 't': 4, 'action': None, 'reward': 1.1661736182157763, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.17)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 5), heading: (-1, 0), action: forward, reward: 1.12674872487
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 25, 't': 5, 'action': 'forward', 'reward': 1.126748724873344, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.13)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (6, 5), heading: (-1, 0), action: forward, reward: 0.974394263202
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 24, 't': 6, 'action': 'forward', 'reward': 0.9743942632016518, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 0.97)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 4), heading: (0, -1), action: right, reward: 2.77959315617
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', None), 'deadline': 23, 't': 7, 'action': 'right', 'reward': 2.7795931561702236, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', None)
Agent followed the waypoint right. (rewarded 2.78)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (6, 4), heading: (0, -1), action: None, reward: -5.34066674722
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': 'left'}, 'violation': 1, 'light': 'green', 'state': ('forward', 'green', 'right', 'left'), 'deadline': 22, 't': 8, 'action': None, 'reward': -5.340666747219795, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'right', 'left')
Agent idled at a green light with no oncoming traffic. (rewarded -5.34)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (6, 4), heading: (0, -1), action: None, reward: 1.08144519035
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'left'), 'deadline': 21, 't': 9, 'action': None, 'reward': 1.081445190347419, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.08)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (6, 4), heading: (0, -1), action: None, reward: 0.938443828275
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 20, 't': 10, 'action': None, 'reward': 0.9384438282749024, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 0.94)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 3), heading: (0, -1), action: forward, reward: 2.08533404731
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 19, 't': 11, 'action': 'forward', 'reward': 2.085334047306103, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.09)
60% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Training trial 421
\-------------------------
Environment.reset(): Trial set up with start = (8, 3), destination = (4, 3), deadline = 20
Simulating trial. . .
epsilon = 0.0150; alpha = 0.2000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 3), heading: (0, -1), action: None, reward: 1.49857077339
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'forward'), 'deadline': 20, 't': 0, 'action': None, 'reward': 1.4985707733908784, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.50)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 3), heading: (1, 0), action: right, reward: 1.42730595947
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 19, 't': 1, 'action': 'right', 'reward': 1.4273059594735118, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 1.43)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: forward, reward: 2.47449033206
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 18, 't': 2, 'action': 'forward', 'reward': 2.474490332057579, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.47)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 2.11266710913
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 17, 't': 3, 'action': None, 'reward': 2.1126671091324183, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.11)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 2.06664475939
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 16, 't': 4, 'action': None, 'reward': 2.066644759386633, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.07)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: None, reward: 2.84441376283
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 15, 't': 5, 'action': None, 'reward': 2.8444137628342165, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.84)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: forward, reward: 2.1614224317
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 14, 't': 6, 'action': 'forward', 'reward': 2.161422431699644, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.16)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: None, reward: 1.06720037839
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 13, 't': 7, 'action': None, 'reward': 1.0672003783929498, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.07)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: None, reward: 2.61883668287
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 12, 't': 8, 'action': None, 'reward': 2.6188366828708443, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.62)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (4, 3), heading: (1, 0), action: forward, reward: 1.67210504284
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 11, 't': 9, 'action': 'forward', 'reward': 1.6721050428384312, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.67)
50% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Testing trial 1
\-------------------------
Environment.reset(): Trial set up with start = (6, 5), destination = (1, 3), deadline = 25
Simulating trial. . .
epsilon = 0.0000; alpha = 0.0000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 5), heading: (0, 1), action: None, reward: 1.80670851779
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 25, 't': 0, 'action': None, 'reward': 1.8067085177853957, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.81)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 5), heading: (0, 1), action: None, reward: 2.40406689917
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 24, 't': 1, 'action': None, 'reward': 2.4040668991734844, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.40)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 5), heading: (0, 1), action: None, reward: 2.07091224839
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 23, 't': 2, 'action': None, 'reward': 2.070912248391359, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.07)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 5), heading: (1, 0), action: left, reward: 1.89622437596
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 22, 't': 3, 'action': 'left', 'reward': 1.8962243759554982, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.90)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (8, 5), heading: (1, 0), action: forward, reward: 2.01968468189
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 2.0196846818899163, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.02)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 5), heading: (1, 0), action: forward, reward: 1.64315394852
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 20, 't': 5, 'action': 'forward', 'reward': 1.6431539485242606, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.64)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 5), heading: (1, 0), action: None, reward: 2.81661063856
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'right', None), 'deadline': 19, 't': 6, 'action': None, 'reward': 2.81661063855857, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 2.82)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (2, 5), heading: (1, 0), action: forward, reward: 1.65461683378
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', 'right'), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 1.6546168337840383, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', 'right')
Agent drove forward instead of left. (rewarded 1.65)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (2, 4), heading: (0, -1), action: left, reward: 1.70250502937
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 17, 't': 8, 'action': 'left', 'reward': 1.7025050293727249, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.70)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (2, 4), heading: (0, -1), action: None, reward: 2.77067110377
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', 'forward'), 'deadline': 16, 't': 9, 'action': None, 'reward': 2.7706711037657032, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 2.77)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (1, 4), heading: (-1, 0), action: left, reward: 2.07652003773
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 15, 't': 10, 'action': 'left', 'reward': 2.076520037732448, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.08)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 3), heading: (0, -1), action: right, reward: 2.65012010374
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 14, 't': 11, 'action': 'right', 'reward': 2.6501201037447197, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent followed the waypoint right. (rewarded 2.65)
52% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Testing trial 2
\-------------------------
Environment.reset(): Trial set up with start = (2, 2), destination = (5, 5), deadline = 30
Simulating trial. . .
epsilon = 0.0000; alpha = 0.0000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 2), heading: (0, 1), action: None, reward: 2.08744409798
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 30, 't': 0, 'action': None, 'reward': 2.0874440979792093, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.09)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 2), heading: (0, 1), action: None, reward: 2.79702052645
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 29, 't': 1, 'action': None, 'reward': 2.797020526452239, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.80)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 2), heading: (0, 1), action: None, reward: 2.89518220617
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 28, 't': 2, 'action': None, 'reward': 2.895182206166993, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.90)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 2), heading: (1, 0), action: left, reward: 1.95760667839
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 27, 't': 3, 'action': 'left', 'reward': 1.957606678393694, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.96)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (4, 2), heading: (1, 0), action: forward, reward: 1.39394115673
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 26, 't': 4, 'action': 'forward', 'reward': 1.3939411567269882, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.39)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (4, 2), heading: (1, 0), action: None, reward: 2.92300652288
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 25, 't': 5, 'action': None, 'reward': 2.9230065228805975, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.92)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 2), heading: (1, 0), action: None, reward: 1.213760266
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'forward'), 'deadline': 24, 't': 6, 'action': None, 'reward': 1.213760265998664, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 1.21)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (5, 2), heading: (1, 0), action: forward, reward: 2.42832138111
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 23, 't': 7, 'action': 'forward', 'reward': 2.428321381105754, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.43)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (5, 2), heading: (1, 0), action: None, reward: 1.70026371078
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', None), 'deadline': 22, 't': 8, 'action': None, 'reward': 1.7002637107795153, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.70)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (5, 2), heading: (1, 0), action: None, reward: 1.00537651156
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'forward', 'left'), 'deadline': 21, 't': 9, 'action': None, 'reward': 1.005376511564806, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'forward', 'left')
Agent properly idled at a red light. (rewarded 1.01)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: right, reward: 0.219989674362
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', 'left'), 'deadline': 20, 't': 10, 'action': 'right', 'reward': 0.219989674361734, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', 'left')
Agent drove right instead of left. (rewarded 0.22)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: None, reward: 1.57168827246
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 19, 't': 11, 'action': None, 'reward': 1.5716882724563346, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.57)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: None, reward: 2.06073514778
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 18, 't': 12, 'action': None, 'reward': 2.0607351477759717, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 2.06)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: None, reward: 1.39715258642
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 17, 't': 13, 'action': None, 'reward': 1.3971525864239194, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.40)
53% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: None, reward: 1.20046844961
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 16, 't': 14, 'action': None, 'reward': 1.2004684496062459, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.20)
50% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: None, reward: 2.50905661684
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 15, 't': 15, 'action': None, 'reward': 2.5090566168411366, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.51)
47% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act() [POST]: location: (5, 4), heading: (0, 1), action: forward, reward: 1.40368556068
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 14, 't': 16, 'action': 'forward', 'reward': 1.4036855606792973, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.40)
43% of time remaining to reach destination.
/-------------------
| Step 17 Results
\-------------------
Environment.step(): t = 17
Environment.act() [POST]: location: (5, 4), heading: (0, 1), action: None, reward: 1.1774542598
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 13, 't': 17, 'action': None, 'reward': 1.1774542597971016, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.18)
40% of time remaining to reach destination.
/-------------------
| Step 18 Results
\-------------------
Environment.step(): t = 18
Environment.act() [POST]: location: (5, 4), heading: (0, 1), action: None, reward: 2.34460895067
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 12, 't': 18, 'action': None, 'reward': 2.3446089506698238, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 2.34)
37% of time remaining to reach destination.
/-------------------
| Step 19 Results
\-------------------
Environment.step(): t = 19
Environment.act() [POST]: location: (5, 4), heading: (0, 1), action: None, reward: 2.58336576604
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'forward'), 'deadline': 11, 't': 19, 'action': None, 'reward': 2.583365766044302, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 2.58)
33% of time remaining to reach destination.
/-------------------
| Step 20 Results
\-------------------
Environment.step(): t = 20
Environment.act() [POST]: location: (5, 4), heading: (0, 1), action: None, reward: 1.442627386
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'forward'), 'deadline': 10, 't': 20, 'action': None, 'reward': 1.4426273860008343, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'forward')
Agent properly idled at a red light. (rewarded 1.44)
30% of time remaining to reach destination.
/-------------------
| Step 21 Results
\-------------------
Environment.step(): t = 21
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 5), heading: (0, 1), action: forward, reward: 1.49365699902
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 9, 't': 21, 'action': 'forward', 'reward': 1.4936569990174233, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.49)
27% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Testing trial 3
\-------------------------
Environment.reset(): Trial set up with start = (6, 3), destination = (5, 6), deadline = 20
Simulating trial. . .
epsilon = 0.0000; alpha = 0.0000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (6, 3), heading: (1, 0), action: None, reward: 1.51471628286
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 20, 't': 0, 'action': None, 'reward': 1.5147162828579308, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.51)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (6, 3), heading: (1, 0), action: None, reward: 2.9018111155
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 19, 't': 1, 'action': None, 'reward': 2.9018111154954944, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.90)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (6, 3), heading: (1, 0), action: None, reward: 1.10003417302
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': None, 'reward': 1.1000341730168555, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.10)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (6, 2), heading: (0, -1), action: left, reward: 2.84666006582
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 17, 't': 3, 'action': 'left', 'reward': 2.8466600658213057, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 2.85)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (6, 7), heading: (0, -1), action: forward, reward: 1.56286937031
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': 'forward', 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', 'right'), 'deadline': 16, 't': 4, 'action': 'forward', 'reward': 1.5628693703093037, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', 'right')
Agent drove forward instead of left. (rewarded 1.56)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (6, 7), heading: (0, -1), action: None, reward: 1.486529843
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 15, 't': 5, 'action': None, 'reward': 1.4865298429970881, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.49)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (6, 7), heading: (0, -1), action: None, reward: 1.98968566773
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 14, 't': 6, 'action': None, 'reward': 1.9896856677307548, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 1.99)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (5, 7), heading: (-1, 0), action: left, reward: 2.26439761918
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 13, 't': 7, 'action': 'left', 'reward': 2.264397619182991, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 2.26)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (5, 7), heading: (-1, 0), action: None, reward: 1.6071336306
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, 'forward'), 'deadline': 12, 't': 8, 'action': None, 'reward': 1.6071336305968211, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.61)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 6), heading: (0, -1), action: right, reward: 1.18812229985
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'left', None), 'deadline': 11, 't': 9, 'action': 'right', 'reward': 1.188122299845633, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'left', None)
Agent followed the waypoint right. (rewarded 1.19)
50% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Testing trial 4
\-------------------------
Environment.reset(): Trial set up with start = (6, 5), destination = (1, 2), deadline = 30
Simulating trial. . .
epsilon = 0.0000; alpha = 0.0000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (7, 5), heading: (1, 0), action: right, reward: 1.50475291389
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 30, 't': 0, 'action': 'right', 'reward': 1.504752913885793, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 1.50)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (7, 5), heading: (1, 0), action: None, reward: 2.19994010246
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 29, 't': 1, 'action': None, 'reward': 2.19994010245691, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.20)
93% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (7, 5), heading: (1, 0), action: None, reward: 2.09820115251
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 28, 't': 2, 'action': None, 'reward': 2.098201152512968, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.10)
90% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (7, 5), heading: (1, 0), action: None, reward: 1.84344615444
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', None), 'deadline': 27, 't': 3, 'action': None, 'reward': 1.8434461544412621, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', None)
Agent properly idled at a red light. (rewarded 1.84)
87% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 5), heading: (1, 0), action: None, reward: 1.08261137025
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 26, 't': 4, 'action': None, 'reward': 1.0826113702489213, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.08)
83% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 5), heading: (1, 0), action: None, reward: 1.53633986381
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'right'), 'deadline': 25, 't': 5, 'action': None, 'reward': 1.5363398638062864, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 1.54)
80% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (8, 5), heading: (1, 0), action: forward, reward: 1.73188370138
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 24, 't': 6, 'action': 'forward', 'reward': 1.7318837013811654, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.73)
77% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (8, 5), heading: (1, 0), action: None, reward: 1.92911644389
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 23, 't': 7, 'action': None, 'reward': 1.9291164438907484, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.93)
73% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (8, 5), heading: (1, 0), action: None, reward: 1.03910300515
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'right'), 'deadline': 22, 't': 8, 'action': None, 'reward': 1.0391030051473211, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'right')
Agent properly idled at a red light. (rewarded 1.04)
70% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (1, 5), heading: (1, 0), action: forward, reward: 2.22372361602
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 21, 't': 9, 'action': 'forward', 'reward': 2.2237236160230287, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.22)
67% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (1, 6), heading: (0, 1), action: right, reward: 1.93180682785
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'left'), 'deadline': 20, 't': 10, 'action': 'right', 'reward': 1.9318068278539837, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'left')
Agent followed the waypoint right. (rewarded 1.93)
63% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (1, 6), heading: (0, 1), action: None, reward: 0.977205184496
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 19, 't': 11, 'action': None, 'reward': 0.9772051844960645, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 0.98)
60% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (1, 6), heading: (0, 1), action: None, reward: 1.07172006564
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 18, 't': 12, 'action': None, 'reward': 1.0717200656438148, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.07)
57% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: forward, reward: 1.12318316932
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 17, 't': 13, 'action': 'forward', 'reward': 1.1231831693216554, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.12)
53% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 2), heading: (0, 1), action: forward, reward: 1.14409375777
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 16, 't': 14, 'action': 'forward', 'reward': 1.1440937577657004, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 1.14)
50% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Testing trial 5
\-------------------------
Environment.reset(): Trial set up with start = (3, 2), destination = (1, 5), deadline = 25
Simulating trial. . .
epsilon = 0.0000; alpha = 0.0000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: right, reward: 1.67290383868
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'forward', 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'forward', 'right'), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 1.672903838676291, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'forward', 'right')
Agent followed the waypoint right. (rewarded 1.67)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: None, reward: 1.42543508826
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 24, 't': 1, 'action': None, 'reward': 1.4254350882554006, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.43)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 2), heading: (-1, 0), action: None, reward: 2.20041874919
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 23, 't': 2, 'action': None, 'reward': 2.2004187491902387, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.20)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (1, 2), heading: (-1, 0), action: forward, reward: 1.99636817283
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 22, 't': 3, 'action': 'forward', 'reward': 1.9963681728253497, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.00)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (1, 7), heading: (0, -1), action: right, reward: 1.4710583745
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 21, 't': 4, 'action': 'right', 'reward': 1.4710583745010817, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 1.47)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (1, 7), heading: (0, -1), action: None, reward: 2.11442141555
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'right', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'right', 'forward'), 'deadline': 20, 't': 5, 'action': None, 'reward': 2.114421415545312, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'right', 'forward')
Agent properly idled at a red light. (rewarded 2.11)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (1, 7), heading: (0, -1), action: None, reward: 2.6725751957
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 19, 't': 6, 'action': None, 'reward': 2.672575195702927, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 2.67)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (1, 6), heading: (0, -1), action: forward, reward: 1.97819904134
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 1.9781990413378168, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 1.98)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (1, 5), heading: (0, -1), action: forward, reward: 1.4859637555
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 17, 't': 8, 'action': 'forward', 'reward': 1.485963755498907, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 1.49)
64% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Testing trial 6
\-------------------------
Environment.reset(): Trial set up with start = (1, 5), destination = (3, 7), deadline = 20
Simulating trial. . .
epsilon = 0.0000; alpha = 0.0000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 4), heading: (0, -1), action: forward, reward: 1.77080197339
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', 'forward'), 'deadline': 20, 't': 0, 'action': 'forward', 'reward': 1.7708019733949858, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', 'forward')
Agent drove forward instead of right. (rewarded 1.77)
95% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 3), heading: (0, -1), action: forward, reward: 0.936687223233
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', 'forward', 'forward'), 'deadline': 19, 't': 1, 'action': 'forward', 'reward': 0.9366872232331839, 'waypoint': 'right'}
Agent previous state: ('right', 'green', 'forward', 'forward')
Agent drove forward instead of right. (rewarded 0.94)
90% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 3), heading: (1, 0), action: right, reward: 2.07810265948
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 18, 't': 2, 'action': 'right', 'reward': 2.078102659482984, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 2.08)
85% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: forward, reward: 2.3003165304
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 17, 't': 3, 'action': 'forward', 'reward': 2.3003165304016284, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.30)
80% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 3), heading: (1, 0), action: None, reward: 1.74096226998
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 16, 't': 4, 'action': None, 'reward': 1.7409622699795988, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.74)
75% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (4, 3), heading: (1, 0), action: forward, reward: 1.1106866297
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', None), 'deadline': 15, 't': 5, 'action': 'forward', 'reward': 1.110686629697573, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', None)
Agent drove forward instead of left. (rewarded 1.11)
70% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 4), heading: (0, 1), action: right, reward: 0.322349048799
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'left'), 'deadline': 14, 't': 6, 'action': 'right', 'reward': 0.3223490487992253, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'left')
Agent drove right instead of left. (rewarded 0.32)
65% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (3, 4), heading: (-1, 0), action: right, reward: 1.81506997434
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 13, 't': 7, 'action': 'right', 'reward': 1.8150699743432215, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.82)
60% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (3, 3), heading: (0, -1), action: right, reward: 1.26533081527
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'right'), 'deadline': 12, 't': 8, 'action': 'right', 'reward': 1.2653308152675895, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'right')
Agent followed the waypoint right. (rewarded 1.27)
55% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (3, 3), heading: (0, -1), action: None, reward: 1.36570273024
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, 'forward'), 'deadline': 11, 't': 9, 'action': None, 'reward': 1.3657027302406024, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, 'forward')
Agent properly idled at a red light. (rewarded 1.37)
50% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (3, 3), heading: (0, -1), action: None, reward: 2.26962940619
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 10, 't': 10, 'action': None, 'reward': 2.2696294061940883, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.27)
45% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (3, 2), heading: (0, -1), action: forward, reward: 1.9856833282
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 9, 't': 11, 'action': 'forward', 'reward': 1.985683328204723, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 1.99)
40% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (3, 2), heading: (0, -1), action: None, reward: 2.3842840568
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 8, 't': 12, 'action': None, 'reward': 2.3842840567953205, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.38)
35% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 7), heading: (0, -1), action: forward, reward: 2.38799768522
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 7, 't': 13, 'action': 'forward', 'reward': 2.3879976852230484, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.39)
30% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Testing trial 7
\-------------------------
Environment.reset(): Trial set up with start = (4, 5), destination = (6, 2), deadline = 25
Simulating trial. . .
epsilon = 0.0000; alpha = 0.0000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (4, 5), heading: (0, 1), action: None, reward: 1.00814215534
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 25, 't': 0, 'action': None, 'reward': 1.0081421553409615, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.01)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (4, 5), heading: (0, 1), action: None, reward: 2.7815603325
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, 'left'), 'deadline': 24, 't': 1, 'action': None, 'reward': 2.781560332496043, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, 'left')
Agent properly idled at a red light. (rewarded 2.78)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (4, 5), heading: (0, 1), action: None, reward: 2.75072823249
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 23, 't': 2, 'action': None, 'reward': 2.7507282324907045, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.75)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (4, 5), heading: (0, 1), action: None, reward: 1.9997652328
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 22, 't': 3, 'action': None, 'reward': 1.9997652327973352, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.00)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (4, 5), heading: (0, 1), action: None, reward: 2.15082213817
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', None, None), 'deadline': 21, 't': 4, 'action': None, 'reward': 2.1508221381712707, 'waypoint': 'left'}
Agent previous state: ('left', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.15)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 5), heading: (1, 0), action: left, reward: 1.28810656947
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, None), 'deadline': 20, 't': 5, 'action': 'left', 'reward': 1.2881065694717362, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, None)
Agent followed the waypoint left. (rewarded 1.29)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (5, 5), heading: (1, 0), action: None, reward: 1.77792083682
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 19, 't': 6, 'action': None, 'reward': 1.7779208368155344, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.78)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 5), heading: (1, 0), action: forward, reward: 2.07896182004
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'forward'), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 2.0789618200395776, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'forward')
Agent followed the waypoint forward. (rewarded 2.08)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (6, 6), heading: (0, 1), action: right, reward: 1.61447263185
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', None, None), 'deadline': 17, 't': 8, 'action': 'right', 'reward': 1.6144726318507183, 'waypoint': 'right'}
Agent previous state: ('right', 'red', None, None)
Agent followed the waypoint right. (rewarded 1.61)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (6, 7), heading: (0, 1), action: forward, reward: 2.07640113518
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 16, 't': 9, 'action': 'forward', 'reward': 2.0764011351843736, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent followed the waypoint forward. (rewarded 2.08)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (6, 2), heading: (0, 1), action: forward, reward: 2.12254861156
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 15, 't': 10, 'action': 'forward', 'reward': 2.1225486115553136, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.12)
56% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Testing trial 8
\-------------------------
Environment.reset(): Trial set up with start = (1, 6), destination = (5, 5), deadline = 25
Simulating trial. . .
epsilon = 0.0000; alpha = 0.0000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (1, 7), heading: (0, 1), action: right, reward: 1.20114286859
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'forward', 'right'), 'deadline': 25, 't': 0, 'action': 'right', 'reward': 1.2011428685948617, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'forward', 'right')
Agent drove right instead of left. (rewarded 1.20)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: right, reward: 1.02525892988
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'right', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, 'forward'), 'deadline': 24, 't': 1, 'action': 'right', 'reward': 1.0252589298797488, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, 'forward')
Agent followed the waypoint right. (rewarded 1.03)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: None, reward: 1.45345626255
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'right', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 23, 't': 2, 'action': None, 'reward': 1.4534562625491154, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 1.45)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (8, 7), heading: (-1, 0), action: None, reward: 2.11591031167
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': None, 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', None, None), 'deadline': 22, 't': 3, 'action': None, 'reward': 2.1159103116689426, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', None, None)
Agent properly idled at a red light. (rewarded 2.12)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: forward, reward: 0.963974445215
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 0.9639744452154377, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 0.96)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: None, reward: 2.42799396128
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 20, 't': 5, 'action': None, 'reward': 2.427993961279784, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.43)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (7, 7), heading: (-1, 0), action: None, reward: 1.59584586975
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': 'right'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'right'), 'deadline': 19, 't': 6, 'action': None, 'reward': 1.595845869754893, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'right')
Agent properly idled at a red light. (rewarded 1.60)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: forward, reward: 2.4710759534
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'forward', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 18, 't': 7, 'action': 'forward', 'reward': 2.471075953397131, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.47)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act() [POST]: location: (6, 7), heading: (-1, 0), action: None, reward: 1.88033397075
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 17, 't': 8, 'action': None, 'reward': 1.8803339707549955, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.88)
64% of time remaining to reach destination.
/-------------------
| Step 9 Results
\-------------------
Environment.step(): t = 9
Environment.act() [POST]: location: (5, 7), heading: (-1, 0), action: forward, reward: 2.26009047105
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 16, 't': 9, 'action': 'forward', 'reward': 2.2600904710491365, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.26)
60% of time remaining to reach destination.
/-------------------
| Step 10 Results
\-------------------
Environment.step(): t = 10
Environment.act() [POST]: location: (5, 6), heading: (0, -1), action: right, reward: 2.21682674246
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('right', 'green', None, None), 'deadline': 15, 't': 10, 'action': 'right', 'reward': 2.216826742460316, 'waypoint': 'right'}
Agent previous state: ('right', 'green', None, None)
Agent followed the waypoint right. (rewarded 2.22)
56% of time remaining to reach destination.
/-------------------
| Step 11 Results
\-------------------
Environment.step(): t = 11
Environment.act() [POST]: location: (5, 6), heading: (0, -1), action: None, reward: 1.39184924127
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'forward'), 'deadline': 14, 't': 11, 'action': None, 'reward': 1.3918492412666446, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 1.39)
52% of time remaining to reach destination.
/-------------------
| Step 12 Results
\-------------------
Environment.step(): t = 12
Environment.act() [POST]: location: (5, 6), heading: (0, -1), action: None, reward: 2.4513382052
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', 'forward'), 'deadline': 13, 't': 12, 'action': None, 'reward': 2.4513382051987973, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', 'forward')
Agent properly idled at a red light. (rewarded 2.45)
48% of time remaining to reach destination.
/-------------------
| Step 13 Results
\-------------------
Environment.step(): t = 13
Environment.act() [POST]: location: (5, 6), heading: (0, -1), action: None, reward: 1.56736351832
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 12, 't': 13, 'action': None, 'reward': 1.567363518318776, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.57)
44% of time remaining to reach destination.
/-------------------
| Step 14 Results
\-------------------
Environment.step(): t = 14
Environment.act() [POST]: location: (5, 6), heading: (0, -1), action: None, reward: 2.04385386469
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'left', 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 11, 't': 14, 'action': None, 'reward': 2.043853864693944, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.04)
40% of time remaining to reach destination.
/-------------------
| Step 15 Results
\-------------------
Environment.step(): t = 15
Environment.act() [POST]: location: (5, 6), heading: (0, -1), action: None, reward: 1.18425176272
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': 'right', 'left': 'left'}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', 'left'), 'deadline': 10, 't': 15, 'action': None, 'reward': 1.1842517627186493, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', 'left')
Agent properly idled at a red light. (rewarded 1.18)
36% of time remaining to reach destination.
/-------------------
| Step 16 Results
\-------------------
Environment.step(): t = 16
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 5), heading: (0, -1), action: forward, reward: 0.654813965717
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'left'), 'deadline': 9, 't': 16, 'action': 'forward', 'reward': 0.6548139657167396, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'left')
Agent followed the waypoint forward. (rewarded 0.65)
32% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Testing trial 9
\-------------------------
Environment.reset(): Trial set up with start = (1, 6), destination = (5, 3), deadline = 35
Simulating trial. . .
epsilon = 0.0000; alpha = 0.0000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (2, 6), heading: (1, 0), action: right, reward: 1.69182531362
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', 'left'), 'deadline': 35, 't': 0, 'action': 'right', 'reward': 1.6918253136165073, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', 'left')
Agent drove right instead of left. (rewarded 1.69)
97% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: forward, reward: 2.36651317289
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 34, 't': 1, 'action': 'forward', 'reward': 2.3665131728928523, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.37)
94% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (3, 6), heading: (1, 0), action: None, reward: 1.9690779163
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'forward', None), 'deadline': 33, 't': 2, 'action': None, 'reward': 1.9690779162956322, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'forward', None)
Agent properly idled at a red light. (rewarded 1.97)
91% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (4, 6), heading: (1, 0), action: forward, reward: 2.7036922543
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'forward', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'forward', None), 'deadline': 32, 't': 3, 'action': 'forward', 'reward': 2.7036922542993524, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'forward', None)
Agent followed the waypoint forward. (rewarded 2.70)
89% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (5, 6), heading: (1, 0), action: forward, reward: 2.56037942263
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'forward'), 'deadline': 31, 't': 4, 'action': 'forward', 'reward': 2.5603794226342, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'forward')
Agent followed the waypoint forward. (rewarded 2.56)
86% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (5, 7), heading: (0, 1), action: right, reward: 2.64246516849
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('right', 'red', 'left', None), 'deadline': 30, 't': 5, 'action': 'right', 'reward': 2.642465168489312, 'waypoint': 'right'}
Agent previous state: ('right', 'red', 'left', None)
Agent followed the waypoint right. (rewarded 2.64)
83% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (5, 2), heading: (0, 1), action: forward, reward: 2.31480649392
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 29, 't': 6, 'action': 'forward', 'reward': 2.3148064939189474, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 2.31)
80% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (5, 3), heading: (0, 1), action: forward, reward: 2.20297735856
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, None), 'deadline': 28, 't': 7, 'action': 'forward', 'reward': 2.2029773585553807, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, None)
Agent followed the waypoint forward. (rewarded 2.20)
77% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
/-------------------------
| Testing trial 10
\-------------------------
Environment.reset(): Trial set up with start = (7, 7), destination = (3, 6), deadline = 25
Simulating trial. . .
epsilon = 0.0000; alpha = 0.0000
/-------------------
| Step 0 Results
\-------------------
Environment.step(): t = 0
Environment.act() [POST]: location: (8, 7), heading: (1, 0), action: left, reward: 2.92474028154
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': None, 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 25, 't': 0, 'action': 'left', 'reward': 2.924740281537062, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 2.92)
96% of time remaining to reach destination.
/-------------------
| Step 1 Results
\-------------------
Environment.step(): t = 1
Environment.act() [POST]: location: (1, 7), heading: (1, 0), action: forward, reward: 2.02738008035
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', None, 'left'), 'deadline': 24, 't': 1, 'action': 'forward', 'reward': 2.027380080353492, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', None, 'left')
Agent followed the waypoint forward. (rewarded 2.03)
92% of time remaining to reach destination.
/-------------------
| Step 2 Results
\-------------------
Environment.step(): t = 2
Environment.act() [POST]: location: (2, 7), heading: (1, 0), action: forward, reward: 2.52144261453
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': 'left'}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', 'left'), 'deadline': 23, 't': 2, 'action': 'forward', 'reward': 2.5214426145342412, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', 'left')
Agent followed the waypoint forward. (rewarded 2.52)
88% of time remaining to reach destination.
/-------------------
| Step 3 Results
\-------------------
Environment.step(): t = 3
Environment.act() [POST]: location: (2, 7), heading: (1, 0), action: None, reward: 1.8569376206
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('forward', 'red', 'left', None), 'deadline': 22, 't': 3, 'action': None, 'reward': 1.8569376205957993, 'waypoint': 'forward'}
Agent previous state: ('forward', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 1.86)
84% of time remaining to reach destination.
/-------------------
| Step 4 Results
\-------------------
Environment.step(): t = 4
Environment.act() [POST]: location: (3, 7), heading: (1, 0), action: forward, reward: 0.97426757423
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('forward', 'green', 'left', None), 'deadline': 21, 't': 4, 'action': 'forward', 'reward': 0.9742675742303499, 'waypoint': 'forward'}
Agent previous state: ('forward', 'green', 'left', None)
Agent followed the waypoint forward. (rewarded 0.97)
80% of time remaining to reach destination.
/-------------------
| Step 5 Results
\-------------------
Environment.step(): t = 5
Environment.act() [POST]: location: (4, 7), heading: (1, 0), action: forward, reward: 1.21023528618
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'right', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'right', None), 'deadline': 20, 't': 5, 'action': 'forward', 'reward': 1.210235286183131, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'right', None)
Agent drove forward instead of left. (rewarded 1.21)
76% of time remaining to reach destination.
/-------------------
| Step 6 Results
\-------------------
Environment.step(): t = 6
Environment.act() [POST]: location: (4, 7), heading: (1, 0), action: None, reward: 2.34101956292
Environment.act(): Step data: {'inputs': {'light': 'red', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'red', 'state': ('left', 'red', 'left', None), 'deadline': 19, 't': 6, 'action': None, 'reward': 2.341019562917462, 'waypoint': 'left'}
Agent previous state: ('left', 'red', 'left', None)
Agent properly idled at a red light. (rewarded 2.34)
72% of time remaining to reach destination.
/-------------------
| Step 7 Results
\-------------------
Environment.step(): t = 7
Environment.act() [POST]: location: (4, 6), heading: (0, -1), action: left, reward: 1.12226818228
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': 'left', 'right': None, 'left': None}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', 'left', None), 'deadline': 18, 't': 7, 'action': 'left', 'reward': 1.122268182277309, 'waypoint': 'left'}
Agent previous state: ('left', 'green', 'left', None)
Agent followed the waypoint left. (rewarded 1.12)
68% of time remaining to reach destination.
/-------------------
| Step 8 Results
\-------------------
Environment.step(): t = 8
Environment.act(): Primary agent has reached destination!
Environment.act() [POST]: location: (3, 6), heading: (-1, 0), action: left, reward: 2.75271322018
Environment.act(): Step data: {'inputs': {'light': 'green', 'oncoming': None, 'right': 'left', 'left': 'forward'}, 'violation': 0, 'light': 'green', 'state': ('left', 'green', None, 'forward'), 'deadline': 17, 't': 8, 'action': 'left', 'reward': 2.752713220183832, 'waypoint': 'left'}
Agent previous state: ('left', 'green', None, 'forward')
Agent followed the waypoint left. (rewarded 2.75)
64% of time remaining to reach destination.
Trial Completed!
Agent reached the destination.
Simulation ended. . .
Before starting to work on implementing your driving agent, it's necessary to first understand the world (environment) which the Smartcab and driving agent work in. One of the major components to building a self-learning agent is understanding the characteristics about the agent, which includes how the agent operates. To begin, simply run the agent.py agent code exactly how it is -- no need to make any additions whatsoever. Let the resulting simulation run for some time to see the various working components. Note that in the visual simulation (if enabled), the white vehicle is the Smartcab.
In a few sentences, describe what you observe during the simulation when running the default agent.py agent code. Some things you could consider:
Hint: From the /smartcab/ top-level directory (where this notebook is located), run the command
'python smartcab/agent.py'
Answer: A. No, the Cab did NOT move. B. No rewards are set at this point. C. Can Not tell anything about the color versus rewards!!
In addition to understanding the world, it is also necessary to understand the code itself that governs how the world, simulation, and so on operate. Attempting to create a driving agent would be difficult without having at least explored the "hidden" devices that make everything work. In the /smartcab/ top-level directory, there are two folders: /logs/ (which will be used later) and /smartcab/. Open the /smartcab/ folder and explore each Python file included, then answer the following question.
agent.py Python file, choose three flags that can be set and explain how they change the simulation.environment.py Python file, what Environment class function is called when an agent performs an action?simulator.py Python file, what is the difference between the 'render_text()' function and the 'render()' function?planner.py Python file, will the 'next_waypoint() function consider the North-South or East-West direction first?Answer: A. For agent.py, the setable flags are "epsilon+ = the probability of a random action), "alpha" = the learning rate which should be less than 1.0, and the "learning" = 'False' or 'True'; B. For environment.py, the class function called for an action that is valid_actions = [None, 'forward', 'left', 'right'] C. For simulator.py, 'rendor_text()' is not GUI (not graphical <=> not using pygame) & 'render()' is GUI (is graphical output <=> using pygame). D. For planner,py, East-West is checked first then North-South in the code order.
The first step to creating an optimized Q-Learning driving agent is getting the agent to actually take valid actions. In this case, a valid action is one of None, (do nothing) 'left' (turn left), right' (turn right), or 'forward' (go forward). For your first implementation, navigate to the 'choose_action()' agent function and make the driving agent randomly choose one of these actions. Note that you have access to several class variables that will help you write this functionality, such as 'self.learning' and 'self.valid_actions'. Once implemented, run the agent file and simulation briefly to confirm that your driving agent is taking a random action each time step.
To obtain results from the initial simulation, you will need to adjust following flags:
'enforce_deadline' - Set this to True to force the driving agent to capture whether it reaches the destination in time.'update_delay' - Set this to a small value (such as 0.01) to reduce the time between steps in each trial.'log_metrics' - Set this to True to log the simluation results as a .csv file in /logs/.'n_test' - Set this to '10' to perform 10 testing trials.Optionally, you may disable to the visual simulation (which can make the trials go faster) by setting the 'display' flag to False. Flags that have been set here should be returned to their default setting when debugging. It is important that you understand what each flag does and how it affects the simulation!
Once you have successfully completed the initial simulation (there should have been 20 training trials and 10 testing trials), run the code cell below to visualize the results. Note that log files are overwritten when identical simulations are run, so be careful with what log file is being loaded! Run the agent.py file after setting the flags from projects/smartcab folder instead of projects/smartcab/smartcab.
# Load the 'sim_no-learning' log file from the initial simulation results
vs.plot_trials('C:\Users\dk2539\ML Nanodegree\Projects1\machine-learning-master\projects\smartcab\logs\sim_no-learning.csv')
Using the visualization above that was produced from your initial simulation, provide an analysis and make several observations about the driving agent. Be sure that you are making at least one observation about each panel present in the visualization. Some things you could consider:
Answer: A. "bad Actions" = Approximately 40% over the trials. B. Yes, random is not a smart way to drive a cab. C. Average reward = -6. This is likely not good. D. Outcome does NOT improve as number of trials increases. E. Both the Reliability & the Safety Ratings are "F" by the stated rubic.
The second step to creating an optimized Q-learning driving agent is defining a set of states that the agent can occupy in the environment. Depending on the input, sensory data, and additional variables available to the driving agent, a set of states can be defined for the agent so that it can eventually learn what action it should take when occupying a state. The condition of 'if state then action' for each state is called a policy, and is ultimately what the driving agent is expected to learn. Without defining states, the driving agent would never understand which action is most optimal -- or even what environmental variables and conditions it cares about!
Inspecting the 'build_state()' agent function shows that the driving agent is given the following data from the environment:
'waypoint', which is the direction the Smartcab should drive leading to the destination, relative to the Smartcab's heading.'inputs', which is the sensor data from the Smartcab. It includes 'light', the color of the light.'left', the intended direction of travel for a vehicle to the Smartcab's left. Returns None if no vehicle is present.'right', the intended direction of travel for a vehicle to the Smartcab's right. Returns None if no vehicle is present.'oncoming', the intended direction of travel for a vehicle across the intersection from the Smartcab. Returns None if no vehicle is present.'deadline', which is the number of actions remaining for the Smartcab to reach the destination before running out of time.Which features available to the agent are most relevant for learning both safety and efficiency? Why are these features appropriate for modeling the Smartcab in the environment? If you did not choose some features, why are those features not appropriate? Please note that whatever features you eventually choose for your agent's state, must be argued for here. That is: your code in agent.py should reflect the features chosen in this answer.
NOTE: You are not allowed to engineer new features for the smartcab.
Answer: A. The features that are important both efficiency and safety are epsilon, epsilon decay function & the alpha learning rate function. Also, the "inputs" are important for enumerating where the driving pitfalls exist: oncoming traffic combined with completing 'left' turns. B. The epsilon & its decay function allow for the learning random actions in the initial process to explore, but later in the trainning, the exporation is not so central to the process, the epsilon must be increasely a smaller positive proportion. The learning rate should a about .5 so the process does not learn to quickly ("over-learn"?) else missing the important 'knowledge'.
When defining a set of states that the agent can occupy, it is necessary to consider the size of the state space. That is to say, if you expect the driving agent to learn a policy for each state, you would need to have an optimal action for every state the agent can occupy. If the number of all possible states is very large, it might be the case that the driving agent never learns what to do in some states, which can lead to uninformed decisions. For example, consider a case where the following features are used to define the state of the Smartcab:
('is_raining', 'is_foggy', 'is_red_light', 'turn_left', 'no_traffic', 'previous_turn_left', 'time_of_day').
How frequently would the agent occupy a state like (False, True, True, True, False, False, '3AM')? Without a near-infinite amount of time for training, it's doubtful the agent would ever learn the proper action!
If a state is defined using the features you've selected from Question 4, what would be the size of the state space? Given what you know about the environment and how it is simulated, do you think the driving agent could learn a policy for each possible state within a reasonable number of training trials?
Hint: Consider the combinations of features to calculate the total number of states!
Answer: the states are: for 'light' there are 2 states = (red, green) for 'left' there are 4 states = (left, right, straight, none) for 'oncoming' from straight there are 2 states = (oncoming from straight, no oncoming straight) for 'oncoming' from right there are 2 states = (oncoming from right, no oncoming right) for 'oncoming' from left there are 2 states = (oncoming from left, no oncoming left)
2x4x2x2x2 = 64 states total states,; this analysis ignores the current state
For your second implementation, navigate to the 'build_state()' agent function. With the justification you've provided in Question 4, you will now set the 'state' variable to a tuple of all the features necessary for Q-Learning. Confirm your driving agent is updating its state by running the agent file and simulation briefly and note whether the state is displaying. If the visual simulation is used, confirm that the updated state corresponds with what is seen in the simulation.
Note: Remember to reset simulation flags to their default setting when making this observation!
The third step to creating an optimized Q-Learning agent is to begin implementing the functionality of Q-Learning itself. The concept of Q-Learning is fairly straightforward: For every state the agent visits, create an entry in the Q-table for all state-action pairs available. Then, when the agent encounters a state and performs an action, update the Q-value associated with that state-action pair based on the reward received and the iterative update rule implemented. Of course, additional benefits come from Q-Learning, such that we can have the agent choose the best action for each state based on the Q-values of each state-action pair possible. For this project, you will be implementing a decaying, $\epsilon$-greedy Q-learning algorithm with no discount factor. Follow the implementation instructions under each TODO in the agent functions.
Note that the agent attribute self.Q is a dictionary: This is how the Q-table will be formed. Each state will be a key of the self.Q dictionary, and each value will then be another dictionary that holds the action and Q-value. Here is an example:
{ 'state-1': {
'action-1' : Qvalue-1,
'action-2' : Qvalue-2,
...
},
'state-2': {
'action-1' : Qvalue-1,
...
},
...
}
Furthermore, note that you are expected to use a decaying $\epsilon$ (exploration) factor. Hence, as the number of trials increases, $\epsilon$ should decrease towards 0. This is because the agent is expected to learn from its behavior and begin acting on its learned behavior. Additionally, The agent will be tested on what it has learned after $\epsilon$ has passed a certain threshold (the default threshold is 0.05). For the initial Q-Learning implementation, you will be implementing a linear decaying function for $\epsilon$.
To obtain results from the initial Q-Learning implementation, you will need to adjust the following flags and setup:
'enforce_deadline' - Set this to True to force the driving agent to capture whether it reaches the destination in time.'update_delay' - Set this to a small value (such as 0.01) to reduce the time between steps in each trial.'log_metrics' - Set this to True to log the simluation results as a .csv file and the Q-table as a .txt file in /logs/.'n_test' - Set this to '10' to perform 10 testing trials.'learning' - Set this to 'True' to tell the driving agent to use your Q-Learning implementation.In addition, use the following decay function for $\epsilon$:
$$ \epsilon_{t+1} = \epsilon_{t} - 0.05, \hspace{10px}\textrm{for trial number } t$$
If you have difficulty getting your implementation to work, try setting the 'verbose' flag to True to help debug. Flags that have been set here should be returned to their default setting when debugging. It is important that you understand what each flag does and how it affects the simulation!
Once you have successfully completed the initial Q-Learning simulation, run the code cell below to visualize the results. Note that log files are overwritten when identical simulations are run, so be careful with what log file is being loaded!
# Load the 'sim_default-learning' file from the default Q-Learning simulation
vs.plot_trials('C:\Users\dk2539\ML Nanodegree\Projects1\machine-learning-master\projects\smartcab\logs\sim_default-learning.csv')
Using the visualization above that was produced from your default Q-Learning simulation, provide an analysis and make observations about the driving agent like in Question 3. Note that the simulation should have also produced the Q-table in a text file which can help you make observations about the agent's learning. Some additional things you could consider:
Answer: A. The "Default" is much better than the "Basic" Q-Learning agent as judged by the graphic deplay above compared the graphics above Question #3. B. Approximately 20 trials before testing. C. exp(-.6*num_trial) is the approximate epsilon function used (derived from reverse function) D. Yes, the number of 'bad' action decreased and the rewards did recrease. E. In both the basic and the default cases, the Reliability and Safety ranking were "F"s.
The third step to creating an optimized Q-Learning agent is to perform the optimization! Now that the Q-Learning algorithm is implemented and the driving agent is successfully learning, it's necessary to tune settings and adjust learning paramaters so the driving agent learns both safety and efficiency. Typically this step will require a lot of trial and error, as some settings will invariably make the learning worse. One thing to keep in mind is the act of learning itself and the time that this takes: In theory, we could allow the agent to learn for an incredibly long amount of time; however, another goal of Q-Learning is to transition from experimenting with unlearned behavior to acting on learned behavior. For example, always allowing the agent to perform a random action during training (if $\epsilon = 1$ and never decays) will certainly make it learn, but never let it act. When improving on your Q-Learning implementation, consider the implications it creates and whether it is logistically sensible to make a particular adjustment.
To obtain results from the initial Q-Learning implementation, you will need to adjust the following flags and setup:
'enforce_deadline' - Set this to True to force the driving agent to capture whether it reaches the destination in time.'update_delay' - Set this to a small value (such as 0.01) to reduce the time between steps in each trial.'log_metrics' - Set this to True to log the simluation results as a .csv file and the Q-table as a .txt file in /logs/.'learning' - Set this to 'True' to tell the driving agent to use your Q-Learning implementation.'optimized' - Set this to 'True' to tell the driving agent you are performing an optimized version of the Q-Learning implementation.Additional flags that can be adjusted as part of optimizing the Q-Learning agent:
'n_test' - Set this to some positive number (previously 10) to perform that many testing trials.'alpha' - Set this to a real number between 0 - 1 to adjust the learning rate of the Q-Learning algorithm.'epsilon' - Set this to a real number between 0 - 1 to adjust the starting exploration factor of the Q-Learning algorithm.'tolerance' - set this to some small value larger than 0 (default was 0.05) to set the epsilon threshold for testing.Furthermore, use a decaying function of your choice for $\epsilon$ (the exploration factor). Note that whichever function you use, it must decay to 'tolerance' at a reasonable rate. The Q-Learning agent will not begin testing until this occurs. Some example decaying functions (for $t$, the number of trials):
$$ \epsilon = a^t, \textrm{for } 0 < a < 1 \hspace{50px}\epsilon = \frac{1}{t^2}\hspace{50px}\epsilon = e^{-at}, \textrm{for } 0 < a < 1 \hspace{50px} \epsilon = \cos(at), \textrm{for } 0 < a < 1$$ You may also use a decaying function for $\alpha$ (the learning rate) if you so choose, however this is typically less common. If you do so, be sure that it adheres to the inequality $0 \leq \alpha \leq 1$.
If you have difficulty getting your implementation to work, try setting the 'verbose' flag to True to help debug. Flags that have been set here should be returned to their default setting when debugging. It is important that you understand what each flag does and how it affects the simulation!
Once you have successfully completed the improved Q-Learning simulation, run the code cell below to visualize the results. Note that log files are overwritten when identical simulations are run, so be careful with what log file is being loaded!
# Load the 'sim_improved-learning' file from the improved Q-Learning simulation
vs.plot_trials('C:\Users\dk2539\ML Nanodegree\Projects1\machine-learning-master\projects\smartcab\logs\sim_improved-learning2a.csv')
Using the visualization above that was produced from your improved Q-Learning simulation, provide a final analysis and make observations about the improved driving agent like in Question 6. Questions you should answer:
Answer:
A. The epsilon decay function I used is self.epsilon = math.exp(-.05*self.trial_num).
B. Approximately 400 trials for learning before testing.
C. I used the "epsilon tolerance" = .015 and "learning rate" = alpha = .5. Both are small and were recommendated in the Project Notes above.
D. The improvement was notable across the entire domain of trials for the one chosen compared the default, but the number of trials were greater in the chosen model.
E. Clearly the driving agent 'learned' successfully in the chosen model.
F. I am very satisfied with this exercise both the reliability & safety!!
Sometimes, the answer to the important question "what am I trying to get my agent to learn?" only has a theoretical answer and cannot be concretely described. Here, however, you can concretely define what it is the agent is trying to learn, and that is the U.S. right-of-way traffic laws. Since these laws are known information, you can further define, for each state the Smartcab is occupying, the optimal action for the driving agent based on these laws. In that case, we call the set of optimal state-action pairs an optimal policy. Hence, unlike some theoretical answers, it is clear whether the agent is acting "incorrectly" not only by the reward (penalty) it receives, but also by pure observation. If the agent drives through a red light, we both see it receive a negative reward but also know that it is not the correct behavior. This can be used to your advantage for verifying whether the policy your driving agent has learned is the correct one, or if it is a suboptimal policy.
Please summarize what the optimal policy is for the smartcab in the given environment. What would be the best set of instructions possible given what we know about the environment? You can explain with words or a table, but you should thoroughly discuss the optimal policy.
Next, investigate the 'sim_improved-learning.txt' text file to see the results of your improved Q-Learning algorithm. For each state that has been recorded from the simulation, is the policy (the action with the highest value) correct for the given state? Are there any states where the policy is different than what would be expected from an optimal policy?
Provide a few examples from your recorded Q-table which demonstrate that your smartcab learned the optimal policy. Explain why these entries demonstrate the optimal policy.
Try to find at least one entry where the smartcab did not learn the optimal policy. Discuss why your cab may have not learned the correct policy for the given state.
Be sure to document your state dictionary below, it should be easy for the reader to understand what each state represents.
Answer: A. By making slight changes to the epsilon & alpha, I believe I can achieve better reward scores and shorter completion durations. B. Are these policies? Yes, These are not what i had expected. C. 2 Optimal Policy Examples are: ('left', 'green', 'left', None) -- forward : 0.30 -- right : 0.53 -- None : -4.76 -- left : 1.75 so left with the highest reward would be the optimal choice
('left', 'green', 'forward', 'left') -- forward : 0.39 -- right : 0.80 -- None : 0.00 -- left : 0.00 so right with the highest reward would the optimal choice
D. All states have at least one possible reward so all states have an optimal policy (action).
'gamma'¶Curiously, as part of the Q-Learning algorithm, you were asked to not use the discount factor, 'gamma' in the implementation. Including future rewards in the algorithm is used to aid in propagating positive rewards backwards from a future state to the current state. Essentially, if the driving agent is given the option to make several actions to arrive at different states, including future rewards will bias the agent towards states that could provide even more rewards. An example of this would be the driving agent moving towards a goal: With all actions and rewards equal, moving towards the goal would theoretically yield better rewards if there is an additional reward for reaching the goal. However, even though in this project, the driving agent is trying to reach a destination in the allotted time, including future rewards will not benefit the agent. In fact, if the agent were given many trials to learn, it could negatively affect Q-values!
There are two characteristics about the project that invalidate the use of future rewards in the Q-Learning algorithm. One characteristic has to do with the Smartcab itself, and the other has to do with the environment. Can you figure out what they are and why future rewards won't work for this project?
Answer:
Note: Once you have completed all of the code implementations and successfully answered each question above, you may finalize your work by exporting the iPython Notebook as an HTML document. You can do this by using the menu above and navigating to
File -> Download as -> HTML (.html). Include the finished document along with this notebook as your submission.